Dimension reduction for finite trees in $\ell_1$

Abstract

We show that every $n$-point tree metric admits a $(1+\varepsilon)$-embedding into $\ell_1^{C(\varepsilon) \log n}$, for every
$\varepsilon > 0$, where $C(\varepsilon) \leq O\big((\tfrac{1}{\varepsilon})^4 \log \tfrac{1}{\varepsilon}\big)$. This matches the natural volume lower bound up to a
factor depending only on $\varepsilon$. Previously, it was unknown whether even complete binary trees
on $n$ nodes could be embedded in $\ell_1^{O(\log n)}$ with $O(1)$ distortion. For complete $d$-ary trees, our
construction achieves $C(\varepsilon) \leq O(\tfrac{1}{\varepsilon^2})$.

1 Introduction

Let $T = (V, E)$ be a finite, connected, undirected tree, equipped with a length function on edges,
$\mathrm{len} : E \to [0, \infty)$. This induces a shortest-path pseudometric¹

$$d_T(u, v) = \text{length of the shortest } u\text{-}v \text{ path in } T.$$

Such a metric space $(V, d_T)$ is called a finite tree metric.

Given two metric spaces $(X, d_X)$ and $(Y, d_Y)$, and a mapping $f : X \to Y$, we define the Lipschitz
constant of $f$ by

$$\|f\|_{\mathrm{Lip}} = \sup_{x \neq y \in X} \frac{d_Y(f(x), f(y))}{d_X(x, y)}.$$

An $L$-Lipschitz map is one for which $\|f\|_{\mathrm{Lip}} \leq L$. One defines the distortion of the mapping $f$ to be
$\mathrm{dist}(f) = \|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$, where the distortion is understood to be infinite when $f$ is not injective.
We say that $(X, d_X)$ $D$-embeds into $(Y, d_Y)$ if there is a mapping $f : X \to Y$ with $\mathrm{dist}(f) \leq D$.
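The definitions above can be made concrete with a small sketch. The helper names `tree_distances`, `l1` and `distortion` are illustrative and do not come from the text; the sketch assumes strictly positive edge lengths, so that distinct vertices are at nonzero distance.

```python
from itertools import combinations

def tree_distances(n, edges):
    """All-pairs path lengths in a tree on vertices 0..n-1; edges = [(u, v, length)]."""
    adj = {v: [] for v in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = {}
    for s in range(n):
        dist[s] = {s: 0.0}
        stack = [s]
        while stack:                      # depth-first traversal from s
            u = stack.pop()
            for v, w in adj[u]:
                if v not in dist[s]:
                    dist[s][v] = dist[s][u] + w
                    stack.append(v)
    return dist

def l1(a, b):
    """l1 distance between two vectors of equal length."""
    return sum(abs(p - q) for p, q in zip(a, b))

def distortion(n, d, f):
    """dist(f) = ||f||_Lip * ||f^{-1}||_Lip for f : {0..n-1} -> (R^k, l1)."""
    expand = contract = 0.0
    for x, y in combinations(range(n), 2):
        dx, dy = d[x][y], l1(f[x], f[y])
        if dy == 0:
            return float("inf")           # f is not injective
        expand = max(expand, dy / dx)     # Lipschitz constant of f
        contract = max(contract, dx / dy) # Lipschitz constant of f^{-1}
    return expand * contract
```

For example, the star with three unit-length leaves embeds isometrically into $\ell_1^2$ by sending the center to the origin and the leaves to $(1,0)$, $(0,1)$, $(-1,0)$, so `distortion` returns 1 for that map.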

Using the notation $\ell_1^k$ for the space $\mathbb{R}^k$ equipped with the $\|\cdot\|_1$ norm, we study the following
question: How large must $k = k(n, \varepsilon)$ be so that every $n$-point tree metric $(1+\varepsilon)$-embeds into $\ell_1^k$?



1.1 Dimension reduction in $\ell_1$

A seminal result of Johnson and Lindenstrauss [JL84] implies that for every $\varepsilon > 0$, every
$n$-point subset $X \subseteq \ell_2$ admits a $(1+\varepsilon)$-distortion embedding into $\ell_2^k$, with $k = O(\frac{\log n}{\varepsilon^2})$. On the
other hand, the known upper bounds for $\ell_1$ are much weaker. Talagrand [Tal90], following earlier
results of Bourgain-Lindenstrauss-Milman [BLM89] and Schechtman [Sch87], showed that every
$n$-dimensional subspace $X \subseteq \ell_1$ (and, in particular, every $n$-point subset) admits a $(1+\varepsilon)$-embedding
into $\ell_1^k$, with $k = O(\frac{n \log n}{\varepsilon^2})$. For $n$-point subsets, this was very recently improved to $k = O(n/\varepsilon^2)$ by
Newman and Rabinovich [NR10], using the spectral sparsification techniques of Batson, Spielman,
and Srivastava [BSS09].

On the other hand, Brinkman and Charikar [BC05] showed that there exist $n$-point subsets
$X \subseteq \ell_1$ such that any $D$-embedding of $X$ into $\ell_1^k$ requires $k \geq n^{\Omega(1/D^2)}$ (see also [LN04] for
a simpler proof). Thus the exponential dimension reduction achievable in the $\ell_2$ case cannot be
matched for the $\ell_1$ norm. More recently, it has been shown by Andoni, Charikar, Neiman, and Nguyen
[ACNN11] that there exist $n$-point subsets such that any $(1+\varepsilon)$-embedding requires dimension at
least $n^{1 - O(1/\log(\varepsilon^{-1}))}$. Regev [Reg11] has given an elegant proof of both these lower bounds based
on information theoretic arguments.

One can still ask about the possibility of more substantial dimension reduction for certain finite
subsets of $\ell_1$. Such a study was undertaken by Charikar and Sahai [CS02]. In particular, it is an
elementary exercise to verify that every finite tree metric embeds isometrically into $\ell_1$, thus the $\ell_1$
dimension reduction question for trees becomes a prominent example of this type. It was shown²
[CS02] that for every $\varepsilon > 0$, every $n$-point tree metric $(1+\varepsilon)$-embeds into $\ell_1^k$ with $k = O(\frac{\log^2 n}{\varepsilon^2})$. It
is quite natural to ask whether the dependence on $n$ can be reduced to the natural volume lower
bound of $\Omega(\log n)$. Indeed, it is Question 3.6 in the list "Open problems on embeddings of finite
metric spaces" maintained by J. Matoušek [Mat], asked by Gupta, Lee, and Talwar³. As noted
there, the question was, surprisingly, even open for the complete binary tree on $n$ vertices. The
present paper resolves this question, achieving the volume lower bound for all finite trees.

¹ This is a pseudometric because we may have $d_T(u, v) = 0$ even for distinct $u, v \in V$.

² The original bound proved in [CS02] grew like $\log^3 n$, but this was improved using an observation of A. Gupta.

Theorem 1.1. For every $\varepsilon > 0$ and $n \in \mathbb{N}$, the following holds. Every $n$-point tree metric admits
a $(1+\varepsilon)$-embedding into $\ell_1^k$ with $k = O\big((\tfrac{1}{\varepsilon})^4 \log \tfrac{1}{\varepsilon} \log n\big)$.

The proof is presented in Section 3.1. We remark that the proof also yields a randomized
polynomial-time algorithm to construct the embedding.

1.2 Notation 

For a graph $G = (V, E)$, we use the notations $V(G)$ and $E(G)$ to denote the vertex and edge sets
of $G$, respectively. For a connected, rooted tree $T = (V, E)$ and $x, y \in V$, we use the notation $P_{xy}$
for the unique path between $x$ and $y$ in $T$, and $P_x$ for $P_{rx}$, where $r$ is the root of $T$.

For $k \in \mathbb{N}$, we write $[k] = \{1, 2, \ldots, k\}$. We also use the asymptotic notation $A \lesssim B$ to denote
that $A = O(B)$, and $A \asymp B$ to denote the conjunction of $A \lesssim B$ and $B \lesssim A$.

1.3 Proof outline and related work 

We first discuss the form that all our embeddings will take. Let $T = (V, E)$ be a finite, connected
tree, and fix a root $r \in V$. For each $v \in V$, recall that $P_v$ denotes the unique simple path from $r$
to $v$. Given a labeling of edges by vectors $\lambda : E \to \mathbb{R}^k$, we can define $\varphi : V \to \mathbb{R}^k$ by,

$$\varphi(x) = \sum_{e \in E(P_x)} \lambda(e). \tag{1}$$

The difficulty now lies in choosing an appropriate labeling $\lambda$. An easy observation is that if we
have $\|\lambda(e)\|_1 = \mathrm{len}(e)$ for all $e \in E$ and the set $\{\lambda(e)\}_{e \in E}$ is orthogonal, then $\varphi$ is an isometry. Of
course, our goal is to use many fewer than $|E|$ dimensions for the embedding. We next illustrate a
major probabilistic technique employed in our approach.
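The isometry observation can be checked on a toy instance. In the sketch below each edge is given its own coordinate (so the labels are nonnegative with disjoint supports, one simple way to realize the orthogonality assumption); the names `edge_sum_embedding` and `l1`, and the convention of naming an edge by its lower endpoint, are illustrative choices rather than notation from the text.

```python
def edge_sum_embedding(parent, label):
    """phi(x) = sum of label[e] over the edges e on the root-to-x path.

    parent maps each vertex to its parent (None for the root); the edge
    (parent[v], v) is named by its lower endpoint v.
    """
    k = len(next(iter(label.values())))
    phi = {}

    def compute(v):
        if v not in phi:
            p = parent[v]
            if p is None:
                phi[v] = [0.0] * k                      # the root maps to the origin
            else:
                phi[v] = [a + b for a, b in zip(compute(p), label[v])]
        return phi[v]

    for v in parent:
        compute(v)
    return phi

def l1(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))

# Toy tree: r - a (length 2), r - b (length 3), a - c (length 1.5).
parent = {"r": None, "a": "r", "b": "r", "c": "a"}
label = {"a": (2.0, 0.0, 0.0), "b": (0.0, 3.0, 0.0), "c": (0.0, 0.0, 1.5)}
phi = edge_sum_embedding(parent, label)
```

Here the $\ell_1$ distance between `phi["c"]` and `phi["b"]` equals the tree distance $1.5 + 2 + 3$, at the cost of using $|E| = 3$ coordinates.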

Re-randomization. Consider an unweighted, complete binary tree of height $h$. Denote the tree
by $T_h = (V_h, E_h)$, let $n = 2^{h+1} - 1$ be the number of vertices, and let $r$ denote the root of the tree.
Let $\kappa \in \mathbb{N}$ be some constant which we will choose momentarily. If we assign to every edge $e \in E_h$
a label $\lambda(e) \in \mathbb{R}^\kappa$, then there is a natural mapping $\tau_\lambda : V_h \to \{0,1\}^{\kappa h}$ given by

$$\tau_\lambda(v) = (\lambda(e_1), \lambda(e_2), \ldots, \lambda(e_k), 0, 0, \ldots, 0), \tag{2}$$

where $E(P_v) = \{e_1, \ldots, e_k\}$, and the edges are labeled in order from the root to $v$. Note that the
preceding definition falls into the framework of (1), by extending each $\lambda(e)$ to a $(\kappa h)$-dimensional
vector padded with zeros, but the specification here will be easier to work with presently.

If we choose the label map $\lambda : E_h \to \{0,1\}^\kappa$ uniformly at random, the probability for the
embedding $\tau_\lambda$ specified in (2) to have $O(1)$ distortion is at most exponentially small in $n$. In fact,
the probability for $\tau_\lambda$ to be injective is already this small. This is because for two nodes $u, v \in V_h$
which are the children of the same node $w$, there is $\Omega(1)$ probability that $\tau_\lambda(u) = \tau_\lambda(v)$, and there
are $\Omega(n)$ such independent events. In Section 2, we show that a judicious application of the Lovász
Local Lemma [EL75] can be used to show that $\tau_\lambda$ has $O(1)$ distortion with non-zero probability. In
fact, we show that this approach can handle arbitrary $k$-ary complete trees, with distortion $1 + \varepsilon$.
Unknown to us at the time of discovery, a closely related construction occurs in the context of tree
codes for interactive communication [Sch96].

³ Asked at the DIMACS Workshop on Discrete Metric Spaces and their Algorithmic Applications (2003). The
question was certainly known to others before 2003, and was asked to the first-named author by Assaf Naor earlier
that year.

Unfortunately, the use of the Local Lemma does not extend well to the more difficult setting
of arbitrary trees. For the general case, we employ an idea of Schulman [Sch96] based on
re-randomization. To see the idea in our simple setting, consider $T_h$ to be composed of a root $r$, under
which lie two copies of $T_{h-1}$, which we call $A$ and $B$, having roots $r_A$ and $r_B$, respectively.

The idea is to assume that, inductively, we already have a labeling $\lambda_{h-1} : E_{h-1} \to \{0,1\}^\kappa$
such that the corresponding map $\tau_{\lambda_{h-1}}$ has $O(1)$ distortion on $T_{h-1}$. We will then construct a
random labeling $\lambda_h : E_h \to \{0,1\}^\kappa$ by using $\lambda_{h-1}$ on the $A$-side, and $\pi(\lambda_{h-1})$ on the $B$-side, where
$\pi$ randomly alters the labeling in such a way that $\tau_{\pi(\lambda_{h-1})}$ is simply $\tau_{\lambda_{h-1}}$ composed with a random
isometry of $\ell_1^{\kappa(h-1)}$. We will then argue that with positive probability (over the choice of $\pi$), $\tau_{\lambda_h}$
has $O(1)$ distortion.

Let $\pi_1, \pi_2, \ldots, \pi_{h-1} : \{0,1\}^\kappa \to \{0,1\}^\kappa$ be i.i.d. random mappings, where the distribution of $\pi_1$
is specified by

$$\pi_1(x_1, x_2, \ldots, x_\kappa) = (\rho_1(x_1), \rho_2(x_2), \ldots, \rho_\kappa(x_\kappa)),$$

where each $\rho_i$ is an independent uniformly random involution $\{0,1\} \to \{0,1\}$. To every edge
$e \in E_{h-1}$, we can assign a height $\alpha(e) \in \{1, 2, \ldots, h-1\}$ which is its distance to the root. From a
labeling $\lambda : E_{h-1} \to \{0,1\}^\kappa$, we define a random labeling $\pi(\lambda) : E_{h-1} \to \{0,1\}^\kappa$ by,

$$\pi(\lambda)(e) = \pi_{\alpha(e)}(\lambda(e)).$$

By a mild abuse of notation, we will consider $\pi(\lambda) : E(B) \to \{0,1\}^\kappa$.

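Finally, given a labeling $\lambda_{h-1} : E_{h-1} \to \{0,1\}^\kappa$, we construct a random labeling
$\lambda_h : E_h \to \{0,1\}^\kappa$ as follows,

$$\lambda_h(e) = \begin{cases} (0, 0, \ldots, 0) & e = (r, r_A) \\ (1, 1, \ldots, 1) & e = (r, r_B) \\ \lambda_{h-1}(e) & e \in E(A) \\ \pi(\lambda_{h-1})(e) & e \in E(B). \end{cases}$$

By construction, the mappings $\tau_{\lambda_h}|_{V(A) \cup \{r\}}$ and $\tau_{\lambda_h}|_{V(B) \cup \{r\}}$ have the same distortion as $\tau_{\lambda_{h-1}}$.
In particular, it is easy to check that $\tau_{\pi(\lambda_{h-1})}$ is simply $\tau_{\lambda_{h-1}}$ composed with an isometry of
$\{0,1\}^{\kappa(h-1)}$.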

Now consider some pair $x \in V(A)$ and $y \in V(B)$. It is simple to argue that it suffices to bound
the distortion for pairs with $m = d_{T_h}(r, x) = d_{T_h}(r, y)$, for $m \in \{1, 2, \ldots, h\}$, so we will assume that
$x, y$ have the same height in $T_h$.

Observe that $\tau_{\lambda_h}(x)$ is fixed with respect to the randomness in $\pi$, thus if we write
$v = \tau_{\lambda_h}(x) - \tau_{\lambda_h}(y)$, where subtraction is taken coordinate-wise, modulo 2, then $v$ has the form

$$v = (1, 1, \ldots, 1, b_1, b_2, \ldots, b_{\kappa(m-1)}),$$

where the $\{b_i\}$ are i.i.d. uniform over $\{0,1\}$. It is thus an easy consequence of Chernoff bounds
that, with probability at least $1 - e^{-m\kappa/8}$, we have

$$\|\tau_{\lambda_h}(x) - \tau_{\lambda_h}(y)\|_1 = \|v\|_1 \geq \frac{\kappa \cdot d_{T_h}(x, y)}{4}.$$

Also, clearly $\|\tau_{\lambda_h}\|_{\mathrm{Lip}} \leq \kappa$.

On the other hand, the number of pairs $x \in V(A)$, $y \in V(B)$ with $m = d_{T_h}(r, x) = d_{T_h}(r, y)$ is
$2^{2(m-1)}$, thus taking a union bound, we have

$$\mathbb{P}\big(\mathrm{dist}(\tau_{\lambda_h}) > \max\{4, \mathrm{dist}(\tau_{\lambda_{h-1}})\}\big) \leq \sum_{m=1}^{h} 2^{2(m-1)} e^{-m\kappa/8},$$

and the latter bound is strictly less than 1 for some $\kappa = O(1)$, showing the existence of a good map.
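As a numerical sanity check of the union bound, the sum $\sum_{m=1}^{h} 2^{2(m-1)} e^{-m\kappa/8}$ can be evaluated directly. The function names below are illustrative, and the sketch only evaluates the stated sum; it does not construct the labeling.

```python
import math

def union_bound(kappa, h):
    """Evaluate sum_{m=1}^{h} 2^(2(m-1)) * exp(-m * kappa / 8)."""
    return sum(4.0 ** (m - 1) * math.exp(-m * kappa / 8.0) for m in range(1, h + 1))

def smallest_kappa(h, limit=1000):
    """Smallest integer kappa <= limit for which the union bound is below 1."""
    for kappa in range(1, limit + 1):
        if union_bound(kappa, h) < 1.0:
            return kappa
    return None
```

The sum is geometric with ratio $4e^{-\kappa/8}$, so it stays below 1 for every $h$ once $\kappa > 8 \ln 5 \approx 12.9$; this is the sense in which a constant $\kappa$ suffices.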

This illustrates how re-randomization (applying a distribution over random isometries to one
side of a tree) can be used to achieve $O(1)$ distortion for embedding $T_h$ into $\ell_1^{O(h)}$. Unfortunately,
the arguments become significantly more delicate when we handle less uniform trees. The full-blown
re-randomization argument occurs in Section 5.

Scale selection. The first step beyond complete binary trees would be in passing to complete
$d$-ary trees for $d \geq 3$. The same construction as above works, but now one has to choose $\kappa \asymp \log d$.
Unfortunately, if the degrees of our tree are not uniform, we have to adopt a significantly more
delicate strategy. It is natural to choose a single number $\kappa(e) \in \mathbb{N}$ for every edge $e \in E$, and then
put $\lambda(e) \in \frac{\mathrm{len}(e)}{\kappa(e)} \{0,1\}^{\kappa(e)}$ (this ensures that the analogue of the embedding $\tau_\lambda$ specified in (2) is
1-Lipschitz).

Observing the case of $d$-ary trees, one might be tempted to put

$$\kappa(e) = \left\lceil \log \frac{|T_u|}{|T_v|} \right\rceil,$$

where $e = (u, v)$ is directed away from the root, and we use $T_v$ to denote the subtree rooted at $v$.
If one simply takes a complete binary tree on $2^h$ nodes, and then connects a star of degree $2^h$ to
every vertex, we have $\kappa(e) \asymp h$ for every edge, and thus the dimension becomes $O(h^2)$ instead of
$O(h)$ as desired.

In fact, there are examples which show that it is impossible to choose $\kappa(u, v)$ to depend only
on the geometry of the subtree rooted at $u$. These "scale selector" values have to look at the
global geometry, and in particular have to encode the volume growth of the tree at many scales
simultaneously. Our eventual scale selector is fairly sophisticated and impossible to describe without
delving significantly into the details of the proof. For our purposes, we need to consider more general
embeddings of type (1). In particular, the coordinates of our labels $\lambda(e) \in \mathbb{R}^k$ will take a range of
different values, not simply a single value as for complete trees.

We do try to maintain one important, related invariant: If $P_v$ is the sequence of edges from the
root to some vertex $v$, then ideally for every coordinate $i \in \{1, 2, \ldots, k\}$ and every value $j \in \mathbb{Z}$,
there will be at most one $e \in P_v$ for which $\lambda(e)_i \in [2^j, 2^{j+1})$. Thus instead of every coordinate
being "touched" at most once on the path from the root to $v$, every coordinate is touched at most
once at every scale along every such path. This ensures that various scales do not interact. For
technical reasons, this property is not maintained exactly, but analogous concepts arise frequently
in the proof.

The restricted class of embeddings we use, along with a discussion of the invariants we maintain,
are introduced in Section 3.2. The actual scale selectors are defined in Section 4.

Controlling the topology. One of the properties that we used above for complete $d$-ary trees is
that the depth of such a tree is $O(\log_d n)$, where $n$ is the number of nodes in the tree. This allowed
us to concatenate vectors down a root-leaf path without exceeding our desired $O(\log n)$ dimension
bound. Of course, for general trees, no similar property need hold. However, there is still a bound
on the topological depth of any $n$-node tree.

To explain this, let $T = (V, E)$ be a tree with root $r$, and define a monotone coloring of $T$ to
be a mapping $\chi : E \to \mathbb{N}$ such that for every $c \in \mathbb{N}$, the color class $\chi^{-1}(c)$ is a connected subset of
some root-leaf path. Such colorings were used in previous works on embedding trees into Hilbert
spaces [Mat99, GKL03, LNP09], as well as for previous low-dimensional embeddings into $\ell_1$ [CS02].
The following lemma is well-known and elementary.

Lemma 1.2. Every connected $n$-vertex rooted tree $T$ admits a monotone coloring such that every
root-leaf path in $T$ contains at most $1 + \log_2 n$ colors.

Proof. For an edge $e \in E(T)$, let $\ell(e)$ denote the number of leaves beneath $e$ in $T$ (including,
possibly, an endpoint of $e$). Letting $\ell(T) = \max_{e \in E} \ell(e)$, we will prove that for $\ell(T) \geq 1$, there
exists a monotone coloring with at most $1 + \log_2(\ell(T)) \leq 1 + \log_2 n$ colors on any root-leaf path.

Suppose that $r$ is the root of $T$. For an edge $e$, let $T_e$ be the subtree beneath $e$, including the
edge $e$ itself. If $r$ is the endpoint of edges $e_1, e_2, \ldots, e_k$, we may color the edges of $T_{e_1}, T_{e_2}, \ldots, T_{e_k}$
separately, since any monotone path is contained completely within exactly one of these subtrees.
Thus we may assume that $r$ is the endpoint of only one edge $e_1$, and then $\ell(T) = \ell(e_1)$.

Choose a leaf $x$ in $T$ such that each connected component $T'$ of $T \setminus E(P_{rx})$ has $\ell(T') \leq \ell(e_1)/2$
(this is easy to do by, e.g., ordering the leaves from left to right in a planar drawing of $T$). Color
the edges $E(P_{rx})$ with color 1, and inductively color each non-trivial connected component $T'$ with
disjoint sets of colors from $\mathbb{N} \setminus \{1\}$. By induction, the maximum number of colors appearing on a
root-leaf path in $T$ is at most $1 + (1 + \log_2(\ell(e_1)/2)) = 1 + \log_2(\ell(T))$, completing the proof. □
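A coloring of this kind can be computed greedily. The sketch below is a heavy-path variant, not necessarily the exact choice made in the proof: at every non-root vertex the incoming color continues into a child with the most leaves below it, and every other child edge starts a fresh color, so each new color on a root-leaf path at least halves the leaf count. The function names and the `children` dictionary representation are assumptions made for illustration.

```python
import math

def monotone_coloring(children, root):
    """Return {(parent, child): color} for a rooted tree given as vertex -> list of children."""
    order, stack = [], [root]
    while stack:                                   # preorder: parents before children
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    leaves = {}
    for v in reversed(order):                      # children are processed before parents
        kids = children.get(v, [])
        leaves[v] = 1 if not kids else sum(leaves[c] for c in kids)
    color, incoming, next_color = {}, {root: None}, 1
    for v in order:
        kids = children.get(v, [])
        if not kids:
            continue
        heavy = max(kids, key=lambda c: leaves[c])
        for c in kids:
            if c == heavy and incoming[v] is not None:
                color[(v, c)] = incoming[v]        # extend the current color class
            else:
                color[(v, c)] = next_color         # start a new color class
                next_color += 1
            incoming[c] = color[(v, c)]
    return color

def max_colors_on_path(children, root, color):
    """Largest number of distinct colors on a root-to-vertex path.

    Counting color changes is enough because each class is contiguous on a path.
    """
    best, stack = 0, [(root, None, 0)]
    while stack:
        v, inc, cnt = stack.pop()
        best = max(best, cnt)
        for c in children.get(v, []):
            col = color[(v, c)]
            stack.append((c, col, cnt + (col != inc)))
    return best
```

On a path graph the whole tree receives a single color, while on a complete binary tree the count grows logarithmically in the number of leaves.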

Instead of dealing directly with edges in our actual embedding, we will deal with color classes.
This poses a number of difficulties, and one major difficulty involves vertices which occur in the
middle of such classes. For dealing with these vertices, we will first preprocess our tree by embedding
it into a product of a small number of new trees, each of which admits colorings of a special type.
This is carried out in Section 3.1.



2 Warm-up: Embedding complete k-ary trees 

We first prove our main result for the special case of complete $k$-ary trees, with an improved
dependence on $\varepsilon$. The main novelty is our use of the Lovász Local Lemma to analyze a simple
random embedding of such trees into $\ell_1$. The proof illustrates the tradeoff between concentration and
the sizes of the sets $\{\{u, v\} \subseteq V : d_T(u, v) = j\}$ for each $j = 1, 2, \ldots$.

Theorem 2.1. Let $T_{k,h}$ be the unweighted, complete $k$-ary tree of height $h$. For every $\varepsilon > 0$, there
exists a $(1+\varepsilon)$-embedding of $T_{k,h}$ into $\ell_1^{O((h \log k)/\varepsilon^2)}$.

In the next section, we introduce our random embedding and analyze the success probability for
a single pair of vertices based on their distance. Then in Section 2.2 we show that with non-zero
probability, the construction succeeds for all vertices. In the coming sections and later, in the proof
of our main theorem, we will employ the following concentration inequality [McD98].

Theorem 2.2. Let $M$ be a non-negative number, and $X_i$ ($1 \leq i \leq n$) be independent random
variables satisfying $X_i \leq \mathbb{E}(X_i) + M$, for $1 \leq i \leq n$. Consider the sum $X = \sum_{i=1}^{n} X_i$ with
expectation $\mathbb{E}(X) = \sum_{i=1}^{n} \mathbb{E}(X_i)$ and $\mathrm{Var}(X) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$. Then we have,

$$\mathbb{P}(X - \mathbb{E}(X) \geq \lambda) \leq \exp\left(\frac{-\lambda^2}{2(\mathrm{Var}(X) + M\lambda/3)}\right). \tag{3}$$



2.1 A single event 

Fix $k, h \in \mathbb{N}$ and $\varepsilon > 0$. Write $T = (V, E)$ for the tree $T_{k,h}$ with root $r \in V$, and let $d_T$ be the
unweighted shortest-path metric on $T$. Additionally, we define,

$$t = \left\lceil \frac{1}{\varepsilon} \right\rceil, \tag{4}$$

and

$$m = t \lceil \log k \rceil. \tag{5}$$

Let $\{v(1), \ldots, v(t)\}$ be the standard basis for $\mathbb{R}^t$. Let $b_1, b_2, \ldots, b_m$ be chosen i.i.d. uniformly
over $\{1, 2, \ldots, t\}$. For the edges $e \in E$, we choose i.i.d. random labels $\lambda(e) \in \mathbb{R}^{m \times t}$, each of which
has the distribution of the random vector (represented in matrix notation),

$$\frac{1}{m} \begin{pmatrix} v(b_1) \\ \vdots \\ v(b_m) \end{pmatrix}. \tag{6}$$

Note that for every $e \in E$, we have $\|\lambda(e)\|_1 = 1$. We now define a random mapping
$g : V \to \mathbb{R}^{m(h-1) \times t}$ as follows: We put $g(r) = 0$, and otherwise,

$$g(v) = \begin{pmatrix} \lambda(e_1) \\ \lambda(e_2) \\ \vdots \\ \lambda(e_j) \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \tag{7}$$

where $e_1, e_2, \ldots, e_j$ is the sequence of edges encountered on the path from the root to $v$. It is
straightforward to check that $g$ is 1-Lipschitz. The next observation is also immediate from the
definition of $g$.

Observation 2.3. For any $v \in V$ and $u \in V(P_v)$, we have $d_T(u, v) = \|g(u) - g(v)\|_1$.
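The random map $g$ is easy to simulate on a small tree. The sketch below is illustrative only: it takes $t = \lceil 1/\varepsilon \rceil$ and $m = t \lceil \log k \rceil$ (natural logarithm, $k \geq 2$) as working assumptions, names a vertex by the tuple of child indices on its root path, uses one block of $m$ rows per edge of that path, and stores each row $\frac{1}{m} v(b)$ compactly as the index $b$. The function names are hypothetical.

```python
import itertools
import math
import random

def build_g(k, h, eps, seed=0):
    """Sample g on the complete k-ary tree whose root-leaf paths have h edges."""
    rng = random.Random(seed)
    t = math.ceil(1 / eps)                # assumed choice of t
    m = t * math.ceil(math.log(k))        # m = t * ceil(log k), requires k >= 2
    g = {(): ()}                          # g(root) = 0: no nonzero rows
    for depth in range(1, h + 1):
        for v in itertools.product(range(k), repeat=depth):
            label = tuple(rng.randrange(t) for _ in range(m))   # one random label
            g[v] = g[v[:-1]] + label      # append the label below the parent's rows
    return g, m

def l1_distance(a, b, m):
    """||g(u) - g(v)||_1 when rows are stored as basis indices of weight 1/m."""
    short, long_ = sorted((a, b), key=len)
    differing = sum(1 for p, q in zip(short, long_) if p != q)
    return (2 * differing + (len(long_) - len(short))) / m

def tree_distance(u, v):
    common = 0
    for a, b in zip(u, v):
        if a != b:
            break
        common += 1
    return len(u) + len(v) - 2 * common
```

With these helpers one can confirm on small instances that $g$ never expands distances and that it is exact on ancestor-descendant pairs, as in Observation 2.3.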

For $m, n \in \mathbb{N}$, and $A \in \mathbb{R}^{m \times n}$, we use the notation $A[i] \in \mathbb{R}^n$ to refer to the $i$th row of $A$. We
now bound the probability that a given pair of vertices experiences a large contraction.

Lemma 2.4. For $C \geq 10$, and $x, y \in V$,

$$\mathbb{P}\big[\|g(x) - g(y)\|_1 \leq (1 - C\varepsilon)\, d_T(x, y)\big] \leq k^{-C d_T(x,y)/2}. \tag{8}$$



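Proof. Fix $x, y \in V$, and let $r'$ denote their lowest common ancestor. We define the family of random
variables $\{X_{ij}\}_{i \in [h-1], j \in [m]}$ by setting $\ell_{ij} = (i-1)m + j$, and then

$$X_{ij} = \|g(x)[\ell_{ij}] - g(r')[\ell_{ij}]\|_1 + \|g(y)[\ell_{ij}] - g(r')[\ell_{ij}]\|_1 - \|g(x)[\ell_{ij}] - g(y)[\ell_{ij}]\|_1. \tag{9}$$

Observe that if $i \leq d_T(r, r')$ then $X_{ij} = 0$ for all $j \in [m]$ since all three terms in (9) are zero.
Furthermore, if $i \geq \min(d_T(r, x), d_T(r, y)) + 1$, then again $X_{ij} = 0$ for all $j \in [m]$, since in this case
one of the first two terms of (9) is zero, and the other is equal to the last. Thus if

$$R = [h-1] \cap [d_T(r, r') + 1, \min(d_T(r, x), d_T(r, y))],$$

then $i \notin R \implies X_{ij} = 0$ for all $j \in [m]$, and additionally we have the estimate,

$$|R| = \min(d_T(r, x), d_T(r, y)) - d_T(r, r') \leq \frac{d_T(x, y)}{2}.$$

Now, using the definition of $g$ in (7), we can write

$$\begin{aligned}
\|g(x) - g(y)\|_1 &= \sum_{i \in [h-1], j \in [m]} \Big( \|g(x)[\ell_{ij}] - g(r')[\ell_{ij}]\|_1 + \|g(y)[\ell_{ij}] - g(r')[\ell_{ij}]\|_1 - X_{ij} \Big) \\
&= \|g(x) - g(r')\|_1 + \|g(y) - g(r')\|_1 - \sum_{i \in [h-1], j \in [m]} X_{ij} \\
&= d_T(x, r') + d_T(y, r') - \sum_{i \in [h-1], j \in [m]} X_{ij} \\
&= d_T(x, y) - \sum_{i \in [h-1], j \in [m]} X_{ij}.
\end{aligned} \tag{10}$$

We will prove the lemma by arguing that,

$$\mathbb{P}\Big[\sum_{i \in [h-1], j \in [m]} X_{ij} \geq C\varepsilon\, d_T(x, y)\Big] \leq k^{-C d_T(x,y)/2}.$$

We start the proof by first bounding the maximum of the $X_{ij}$ variables. Since, for every $\ell$, we
have

$$\|g(x)[\ell] - g(r')[\ell]\|_1,\ \|g(y)[\ell] - g(r')[\ell]\|_1 \in \Big\{0, \frac{1}{m}\Big\},$$

we conclude that,

$$\max_{i \in [h-1], j \in [m]} X_{ij} \leq \frac{2}{m}. \tag{11}$$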

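For $i \in R$ and $j \in [m]$, using (6) and (7), we see that $g(x)[\ell_{ij}] - g(r')[\ell_{ij}] = \frac{1}{m} v(\alpha)$ and
$g(y)[\ell_{ij}] - g(r')[\ell_{ij}] = \frac{1}{m} v(\beta)$, where $\alpha$ and $\beta$ are i.i.d. uniform over $\{1, \ldots, t\}$. Hence, for $i \in R$
and $j \in [m]$, we have

$$\mathbb{P}[X_{ij} \neq 0] = \frac{1}{t}.$$

We can thus bound the expected value and variance of $X_{ij}$ for $i \in R$ and $j \in [m]$ using

$$\mathbb{E}[X_{ij}] \leq \frac{2}{tm},$$

and

$$\mathrm{Var}(X_{ij}) \leq \frac{4}{tm^2}.$$

Using the estimate $|R| \leq d_T(x, y)/2$, we have

$$\sum_{i=1}^{h-1} \sum_{j=1}^{m} \mathbb{E}[X_{ij}] = \sum_{i \in R, j \in [m]} \mathbb{E}[X_{ij}] \leq \sum_{i \in R} \frac{2}{t} \leq \frac{d_T(x, y)}{t},$$

and

$$\sum_{i=1}^{h-1} \sum_{j=1}^{m} \mathrm{Var}(X_{ij}) = \sum_{i \in R, j \in [m]} \mathrm{Var}(X_{ij}) \leq \sum_{i \in R} \frac{4}{tm} \leq \frac{2\, d_T(x, y)}{tm}.$$

We now apply Theorem 2.2, with $M = 2/m$ by (11), to complete the proof:

$$\begin{aligned}
\mathbb{P}\Big[\sum_{i \in [h-1], j \in [m]} X_{ij} \geq C\, \frac{d_T(x, y)}{t}\Big]
&\leq \mathbb{P}\Big[\sum_{i \in [h-1], j \in [m]} X_{ij} - \mathbb{E}\Big[\sum_{i \in [h-1], j \in [m]} X_{ij}\Big] \geq (C-1)\, \frac{d_T(x, y)}{t}\Big] \\
&\leq \exp\left(\frac{-((C-1)\, d_T(x,y)/t)^2}{2\big(\sum_{i \in [h-1], j \in [m]} \mathrm{Var}(X_{ij}) + (C-1)(d_T(x,y)/t)(\tfrac{2}{m})/3\big)}\right) \\
&\leq \exp\left(\frac{-((C-1)\, d_T(x,y)/t)^2}{2\big(2\, d_T(x,y)/(tm) + (C-1)(d_T(x,y)/t)(\tfrac{2}{m})/3\big)}\right) \\
&= \exp\left(-\frac{(C-1)^2}{4(1 + (C-1)/3)} \cdot \frac{m}{t} \cdot d_T(x, y)\right).
\end{aligned}$$

An elementary calculation shows that for $C \geq 10$, we have $\frac{(C-1)^2}{4(1 + (C-1)/3)} \geq \frac{C}{2}$. Hence

$$\mathbb{P}\Big[\sum_{i \in [h-1], j \in [m]} X_{ij} \geq C\varepsilon\, d_T(x, y)\Big]
\leq \mathbb{P}\Big[\sum_{i \in [h-1], j \in [m]} X_{ij} \geq C\, \frac{d_T(x, y)}{t}\Big]
\leq \exp\left(-\frac{C\, m\, d_T(x, y)}{2t}\right)
\leq k^{-C d_T(x,y)/2},$$

completing the proof.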
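The elementary calculation for the constant can be checked numerically. The helper name `exponent_factor` is illustrative; it evaluates the coefficient $(C-1)^2 / (4(1 + (C-1)/3))$ that appears in the exponent and compares it with $C/2$.

```python
def exponent_factor(C):
    """Coefficient (C - 1)^2 / (4 * (1 + (C - 1) / 3)) from the Bernstein-type bound."""
    return (C - 1) ** 2 / (4 * (1 + (C - 1) / 3))

def holds_from(C_min, C_max):
    """Check exponent_factor(C) >= C / 2 for every integer C in [C_min, C_max]."""
    return all(exponent_factor(C) >= C / 2 for C in range(C_min, C_max + 1))
```

Algebraically the inequality is equivalent to $C^2 - 10C + 3 \geq 0$, which holds once $C \geq 5 + \sqrt{22} \approx 9.7$; this is consistent with the requirement $C \geq 10$ and shows that $C = 9$ would not suffice.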
2.2 The Local Lemma argument

We first give the statement of the Lovász Local Lemma [EL75] and then use it in conjunction with
Lemma 2.4 to complete the proof of Theorem 2.1.

Theorem 2.5. Let $\mathcal{A}$ be a finite set of events in some probability space. For $A \in \mathcal{A}$, let $\Gamma(A) \subseteq \mathcal{A}$
be such that $A$ is independent from the collection of events $\mathcal{A} \setminus (\{A\} \cup \Gamma(A))$. If there exists an
assignment $x : \mathcal{A} \to (0, 1)$ such that for all $A \in \mathcal{A}$, we have

$$\mathbb{P}(A) \leq x(A) \prod_{B \in \Gamma(A)} (1 - x(B)),$$

then the probability that none of the events in $\mathcal{A}$ occur is at least $\prod_{A \in \mathcal{A}} (1 - x(A)) > 0$.



Proof of Theorem 2.1. We may assume that $k \geq 2$. We will use Theorem 2.5 and Lemma 2.4 to
show that with non-zero probability the following inequality holds for all $u, v \in V$,

$$\|g(u) - g(v)\|_1 \geq (1 - 14\varepsilon)\, d_T(u, v).$$

For $u, v \in V$, let $\mathcal{E}_{uv}$ be the event $\{\|g(u) - g(v)\|_1 \leq (1 - 14\varepsilon)\, d_T(u, v)\}$. Now, for $u, v \in V$,
define

$$x_{uv} = k^{-3 d_T(u, v)}.$$

Observe that for vertices $u, v \in V$ and a subset $V' \subseteq V$, the event $\mathcal{E}_{uv}$ is mutually independent
of the family $\{\mathcal{E}_{u'v'} : u', v' \in V'\}$ whenever the induced subgraph of $T$ spanned by $V'$ contains no
edges from $P_{uv}$. Thus using Theorem 2.5, it is sufficient to show that for all $u, v \in V$,

$$\mathbb{P}(\mathcal{E}_{uv}) \leq x_{uv} \prod_{\substack{s, t \in V: \\ E(P_{st}) \cap E(P_{uv}) \neq \emptyset}} (1 - x_{st}). \tag{16}$$

Indeed, this will complete the proof of Theorem 2.1.

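To this end, fix $u, v \in V$. For $e \in E$ and $i \in \mathbb{N}$, we define the set,

$$S_{e,i} = \{(u, v) : u, v \in V,\ d_T(u, v) = i, \text{ and } e \in E(P_{uv})\}.$$

Since $T$ is a $k$-ary tree,

$$|S_{e,i}| \leq \sum_{j=1}^{i} k^{j-1} \cdot k^{i-j} = i \cdot k^{i-1}. \tag{17}$$

Thus we can write,

$$\begin{aligned}
x_{uv} \prod_{\substack{s, t \in V: \\ E(P_{st}) \cap E(P_{uv}) \neq \emptyset}} (1 - x_{st})
&\geq x_{uv} \prod_{e \in E(P_{uv})} \prod_{i \in \mathbb{N}} \prod_{(s,t) \in S_{e,i}} (1 - x_{st}) \\
&= k^{-3 d_T(u,v)} \prod_{e \in E(P_{uv})} \prod_{i \in \mathbb{N}} \prod_{(s,t) \in S_{e,i}} (1 - k^{-3i}) \\
&\geq k^{-3 d_T(u,v)} \prod_{e \in E(P_{uv})} \prod_{i \in \mathbb{N}} \big(1 - k^{2i}(k^{-3i})\big) \\
&= k^{-3 d_T(u,v)} \prod_{e \in E(P_{uv})} \prod_{i \in \mathbb{N}} \big(1 - k^{-i}\big).
\end{aligned}$$

For $x \in [0, \frac{1}{2}]$, we have $e^{-2x} \leq 1 - x$, and since $k \geq 2$, we have $k^{-i} \leq \frac{1}{2}$ for all $i \in \mathbb{N}$, hence

$$\begin{aligned}
x_{uv} \prod_{\substack{s, t \in V: \\ E(P_{st}) \cap E(P_{uv}) \neq \emptyset}} (1 - x_{st})
&\geq k^{-3 d_T(u,v)} \prod_{e \in E(P_{uv})} \prod_{i \in \mathbb{N}} \exp\big(-2 k^{-i}\big) \\
&= k^{-3 d_T(u,v)} \prod_{e \in E(P_{uv})} \exp\Big(\frac{-2}{k-1}\Big) \\
&\geq k^{-3 d_T(u,v)} \prod_{e \in E(P_{uv})} \exp\Big(\frac{-4}{k}\Big) \\
&= k^{-3 d_T(u,v)} \exp\Big(\frac{-4\, d_T(u, v)}{k}\Big).
\end{aligned}$$

Since $k \geq 2$, we conclude that,

$$x_{uv} \prod_{\substack{s, t \in V: \\ E(P_{st}) \cap E(P_{uv}) \neq \emptyset}} (1 - x_{st}) \geq k^{-7 d_T(u, v)}.$$

On the other hand, Lemma 2.4 applied with $C = 14$ gives,

$$\mathbb{P}\big[\|g(u) - g(v)\|_1 \leq (1 - 14\varepsilon)\, d_T(u, v)\big] \leq k^{-7 d_T(u, v)},$$

yielding (16), and completing the proof. □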
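The two numerical facts used at the end of this argument can be sanity-checked in a few lines. The sketch works with logarithms to avoid underflow; the function names are illustrative, and the quantities compared are $k^{-3d} e^{-4d/k}$ (the lower bound on the Local Lemma right-hand side) and $k^{-7d}$ (the bound from Lemma 2.4 with $C = 14$).

```python
import math

def log_local_lemma_side(k, d):
    """log of k^(-3d) * exp(-4d/k)."""
    return -3 * d * math.log(k) - 4 * d / k

def log_single_event_bound(k, d):
    """log of k^(-7d)."""
    return -7 * d * math.log(k)

def exp_inequality_holds(steps=1000):
    """Check e^(-2x) <= 1 - x on a grid of points in [0, 1/2]."""
    return all(math.exp(-2 * x) <= 1 - x
               for x in (0.5 * j / steps for j in range(steps + 1)))
```

The comparison reduces to $1/k \leq \ln k$, which already holds at $k = 2$ and only improves as $k$ grows.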



3 Colors and scales 

In the present section, we develop some tools for our eventual embedding. The proof of our main
theorem appears in the next section, but relies on a key theorem which is only proved in Section 5.

3.1 Monotone colorings

Let $T = (V, E)$ be a metric tree rooted at a vertex $r \in V$. Recall that such a tree $T$ is equipped
with a length $\mathrm{len} : E \to [0, \infty)$. We extend this to subsets of edges $S \subseteq E$ via $\mathrm{len}(S) = \sum_{e \in S} \mathrm{len}(e)$.
We recall that a monotone coloring is a mapping $\chi : E \to \mathbb{N}$ such that each color class
$\chi^{-1}(c) = \{e \in E : \chi(e) = c\}$ is a connected subset of some root-leaf path. For a set of edges $S \subseteq E$, we write
$\chi(S)$ for the set of colors occurring in $S$. We define the multiplicity of $\chi$ by

$$M(\chi) = \max_{v \in V} |\chi(P_v)|.$$

Given such a coloring $\chi$ and $c \in \mathbb{N}$, we define,

$$\mathrm{len}_\chi(c) = \mathrm{len}(\chi^{-1}(c)),$$

and $\mathrm{len}_\chi(S) = \sum_{c \in S} \mathrm{len}_\chi(c)$, if $S \subseteq \mathbb{N}$.

For every $\delta \in [0, 1]$ and $x, y \in V$, we define the set of colors

$$C_\chi(x, y; \delta) = \{c : \mathrm{len}(P_{xy} \cap \chi^{-1}(c)) \leq \delta \cdot \mathrm{len}_\chi(c)\} \cap \big(\chi(P_x) \,\triangle\, \chi(P_y)\big).$$

This is the set of colors $c$ which occur in only one of $P_x$ and $P_y$, and for which the contribution to
$P_{xy}$ is significantly smaller than $\mathrm{len}_\chi(c)$. We also put,

$$\rho_\chi(x, y; \delta) = \mathrm{len}_\chi(C_\chi(x, y; \delta)). \tag{18}$$
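To make the definitions of $C_\chi$ and $\rho_\chi$ concrete, here is a small sketch that evaluates $\rho_\chi(x, y; \delta)$ directly from the definition. The representation (parent pointers, an edge named by its lower endpoint) and the function name `rho` are assumptions made for illustration.

```python
def rho(parent, length, color, x, y, delta):
    """Evaluate rho_chi(x, y; delta) from the definition.

    parent[v] is the parent of v (None for the root); length[v] and color[v]
    describe the edge (parent[v], v).
    """
    def root_path(v):
        edges = set()
        while parent[v] is not None:
            edges.add(v)
            v = parent[v]
        return edges

    px, py = root_path(x), root_path(y)
    pxy = px ^ py                                   # edges of the x-y path
    colors_x = {color[e] for e in px}
    colors_y = {color[e] for e in py}
    len_chi = {}
    for e, c in color.items():                      # total length of each color class
        len_chi[c] = len_chi.get(c, 0.0) + length[e]
    total = 0.0
    for c in colors_x ^ colors_y:                   # colors on exactly one root path
        on_path = sum(length[e] for e in pxy if color[e] == c)
        if on_path <= delta * len_chi[c]:
            total += len_chi[c]
    return total

# Toy tree: color 1 is the path r - a - b (lengths 1 and 9), color 2 is the edge r - c (length 2).
parent = {"r": None, "a": "r", "b": "a", "c": "r"}
length = {"a": 1.0, "b": 9.0, "c": 2.0}
color = {"a": 1, "b": 1, "c": 2}
```

For the pair $(a, c)$ only one unit of the length-10 color class 1 lies on the path, so with $\delta = 0.25$ that class is counted in full, while for $(b, c)$ the whole class lies on the path and nothing is counted.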

We now state a key theorem that will be proved in Section 5.

Theorem 3.1. For every $\varepsilon, \delta > 0$, there is a value $C(\varepsilon, \delta) = O\big((\tfrac{1}{\varepsilon} + \log \log \tfrac{1}{\delta})^3 \log \tfrac{1}{\varepsilon}\big)$ such that
the following holds. For any metric tree $T = (V, E)$ and any monotone coloring $\chi : E \to \mathbb{N}$, there
exists a mapping $F : V \to \ell_1^{C(\varepsilon, \delta)(\log n + M(\chi))}$ such that for all $x, y \in V$,

$$(1 - \varepsilon)\, d_T(x, y) - \delta\, \rho_\chi(x, y; \delta) \leq \|F(x) - F(y)\|_1 \leq d_T(x, y). \tag{19}$$

The problem one now confronts is whether the loss in the $\rho_\chi(x, y; \delta)$ term can be tolerated. In
general, we do not have a way to do this, so we first embed our tree into a product of a small
number of trees in a way that allows us to control the corresponding $\rho$-terms.

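Lemma 3.2. For every $\varepsilon \in (0, 1)$, there is a number $k \asymp \frac{1}{\varepsilon}$ such that the following holds. For every
metric tree $T = (V, E)$ and monotone coloring $\chi : E \to \mathbb{N}$, there exist $k$ metric trees $T_1, T_2, \ldots, T_k$
with monotone colorings $\{\chi_i : E(T_i) \to \mathbb{N}\}_{i=1}^{k}$ and mappings $\{f_i : V \to V(T_i)\}_{i=1}^{k}$ such that
$M(\chi_i) \leq M(\chi)$, and $|V(T_i)| \leq |V|$ for all $i \in [k]$, and the following conditions hold for all $x, y \in V$:

(a) We have,

$$\frac{1}{k} \sum_{i=1}^{k} d_{T_i}(f_i(x), f_i(y)) \geq (1 - \varepsilon)\, d_T(x, y). \tag{20}$$

(b) For all $i \in [k]$, we have

$$d_{T_i}(f_i(x), f_i(y)) \leq (1 + \varepsilon)\, d_T(x, y). \tag{21}$$

(c) There exists a number $j \in [k]$ such that

$$\varepsilon\, d_T(x, y) \geq \frac{2^{-(k+1)}}{k} \sum_{\substack{i=1 \\ i \neq j}}^{k} \rho_{\chi_i}\big(f_i(x), f_i(y); 2^{-(k+1)}\big). \tag{22}$$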



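Using Lemma 3.2 in conjunction with Theorem 3.1, we can now prove the main theorem
(Theorem 1.1).

Proof of Theorem 1.1. Let $\varepsilon > 0$ be given, let $T = (V, E)$ be an $n$-vertex metric tree. Let $\chi : E \to \mathbb{N}$
be a monotone coloring with $M(\chi) \leq O(\log n)$, which exists by Lemma 1.2. Apply Lemma 3.2 to
obtain metric trees $T_1, \ldots, T_k$ with corresponding monotone colorings $\chi_1, \ldots, \chi_k$ and mappings
$f_i : V \to V(T_i)$. Observe that $M(\chi_i) \leq O(\log n)$ for each $i \in [k]$.

Let $F_i : V(T_i) \to \ell_1^{C(\varepsilon) \log n}$ be the mapping obtained by applying Theorem 3.1 to $T_i$ and $\chi_i$, for
each $i \in [k]$, with $\delta = 2^{-(k+1)}$, where $C(\varepsilon) = O\big((\tfrac{1}{\varepsilon})^3 \log \tfrac{1}{\varepsilon}\big)$. Finally, we put

$$F = \frac{1}{k} \big( (F_1 \circ f_1) \oplus (F_2 \circ f_2) \oplus \cdots \oplus (F_k \circ f_k) \big)$$

so that $F : V \to \ell_1^{O((\frac{1}{\varepsilon})^4 \log \frac{1}{\varepsilon} \log n)}$. We will prove that $F$ is a $(1 + O(\varepsilon))$-embedding, completing the
proof.

First, observe that each $F_i$ is 1-Lipschitz (Theorem 3.1). In conjunction with condition (b) of
Lemma 3.2, which says that $\|f_i\|_{\mathrm{Lip}} \leq 1 + \varepsilon$ for each $i \in [k]$, we have $\|F\|_{\mathrm{Lip}} \leq 1 + \varepsilon$.

For the other side, fix $x, y \in V$ and let $j \in [k]$ be the number guaranteed in condition (c) of
Lemma 3.2. Then we have,

$$\begin{aligned}
\|F(x) - F(y)\|_1 &= \frac{1}{k} \sum_{i=1}^{k} \|(F_i \circ f_i)(x) - (F_i \circ f_i)(y)\|_1 \\
&\geq \frac{1}{k} \sum_{i \neq j} \Big( (1 - \varepsilon)\, d_{T_i}(f_i(x), f_i(y)) - 2^{-(k+1)} \rho_{\chi_i}\big(f_i(x), f_i(y); 2^{-(k+1)}\big) \Big) \\
&\geq \frac{1}{k} \sum_{i \neq j} (1 - \varepsilon)\, d_{T_i}(f_i(x), f_i(y)) - \varepsilon\, d_T(x, y) \\
&\geq \frac{1}{k} \sum_{i=1}^{k} (1 - \varepsilon)\, d_{T_i}(f_i(x), f_i(y)) - \frac{1 + \varepsilon}{k}\, d_T(x, y) - \varepsilon\, d_T(x, y) \\
&\geq (1 - \varepsilon)^2\, d_T(x, y) - \frac{1 + \varepsilon}{k}\, d_T(x, y) - \varepsilon\, d_T(x, y) \\
&\geq (1 - O(\varepsilon))\, d_T(x, y),
\end{aligned}$$

where in the final line we have used $k \asymp \frac{1}{\varepsilon}$, completing the proof. □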



We now move on to the proof of Lemma 3.2. We begin by proving an analogous statement
for the half line $[0, \infty)$. An $\mathbb{R}$-star is a metric space formed as follows: Given a sequence $\{a_i\}_{i=1}^{\infty}$
of positive numbers, one takes the disjoint union of the intervals $\{[0, a_1], [0, a_2], \ldots\}$, and then
identifies the 0 point in each, which is canonically called the root of the $\mathbb{R}$-star. An $\mathbb{R}$-star $S$ carries
the natural induced length metric $d_S$. We refer to the associated intervals as branches, and the
length of a branch is the associated number $a_i$. Finally, if $S$ is an $\mathbb{R}$-star, and $x \in S \setminus \{0\}$, we use
$\ell(x)$ to denote the length of the branch containing $x$. We put $\ell(0) = 0$.

Lemma 3.3. For every $k \in \mathbb{N}$ with $k \geq 2$, there exist $\mathbb{R}$-stars $S_1, \ldots, S_k$ with mappings

$$f_i : [0, \infty) \to S_i$$

such that the following conditions hold:

i) For each $i \in [k]$, $f_i(0)$ is the root of $S_i$.

ii) For all $x, y \in [0, \infty)$, $\frac{1}{k} \sum_{i=1}^{k} d_{S_i}(f_i(x), f_i(y)) \geq \left(1 - \frac{7}{k}\right) |x - y|$.

iii) For each $i \in [k]$, $f_i$ is $(1 + 2^{-k+1})$-Lipschitz.

iv) For $x \in [0, \infty)$, we have $\ell(f_i(x)) \leq 2^{k-1} x$.

v) For $x \in [0, \infty)$, there are at most two values of $i \in [k]$ such that

$$d_{S_i}(f_i(0), f_i(x)) \leq 2^{-k} \ell(f_i(x)).$$

vi) For all $x, y \in [0, \infty)$, there is at most one value of $i \in [k]$ such that $f_i(x)$ and $f_i(y)$ are in
different branches of $S_i$ and

$$2^{-k} \big( \ell(f_i(x)) + \ell(f_i(y)) \big) > 2|x - y|.$$



Proof. Assume that $k \geq 2$. We first construct $\mathbb{R}$-stars $S_1, \ldots, S_k$. We will index the branches of
each star by $\mathbb{Z}$. For $i \in [k]$, $S_i$ is a star whose $j$th branch, for $j \in \mathbb{Z}$, has length $2^{i + k(j+1) - 1}$. We
will use the notation $(i, j, d)$ to denote the point at distance $d$ from the root on the $j$th branch of
$S_i$. Observe that $(i, j, 0)$ and $(i, j', 0)$ describe the same point (the root of $S_i$) for all $j, j' \in \mathbb{Z}$.
Now, we define for every $i \in [k]$, a function $f_i : [0, \infty) \to S_i$ as follows:

$$f_i(x) = \begin{cases} \big(i, j, (x - 2^{i + kj})/(1 - 2^{1-k})\big) & \text{for } 2^{-i} x \in [2^{kj}, 2^{k(j+1)-1}) \\ \big(i, j, 2^{i + k(j+1)} - x\big) & \text{for } 2^{-i} x \in [2^{k(j+1)-1}, 2^{k(j+1)}). \end{cases}$$

Condition (i) is immediate. It is also straightforward to verify that

$$\|f_i\|_{\mathrm{Lip}} \leq (1 - 2^{1-k})^{-1} \leq 1 + 2^{-k+1}, \tag{23}$$

yielding condition (iii).

Toward verifying condition (ii), observe that for every $x \in [0, \infty)$ and $j \in \{0, 1, \ldots, k-2\}$ we
have

$$d_{S_i}(f_i(x), 0) \geq \big(x - 2^{\lfloor \log_2 x \rfloor - j}\big)/(1 - 2^{1-k}) \geq x - 2^{\lfloor \log_2 x \rfloor - j},$$

when $i \equiv \lfloor \log_2 x \rfloor - j \pmod{k}$. Using this, we can write

$$\begin{aligned}
\sum_{i=1}^{k} d_{S_i}(f_i(x), f_i(0)) &\geq \sum_{j = \lfloor \log_2 x \rfloor - k + 2}^{\lfloor \log_2 x \rfloor} \big(x - 2^{j}\big) \\
&= (k - 1)x - \sum_{j = \lfloor \log_2 x \rfloor - k + 2}^{\lfloor \log_2 x \rfloor} 2^{j} \\
&\geq (k - 1)x - 2^{\lfloor \log_2 x \rfloor + 1} \\
&\geq (k - 3)x.
\end{aligned} \tag{24}$$

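Now fix $x, y \in [0, \infty)$ with $x \leq y$. If $x \leq y/2$, then we can use the triangle inequality, together
with (23) and (24) to write,

$$\begin{aligned}
\frac{1}{k} \sum_{i=1}^{k} d_{S_i}(f_i(x), f_i(y)) &\geq \frac{1}{k} \sum_{i=1}^{k} \big( d_{S_i}(f_i(0), f_i(y)) - d_{S_i}(f_i(0), f_i(x)) \big) \\
&\geq (1 - 3/k)\, y - (1 + 2^{1-k})\, x \\
&\geq (1 - 3/k)\, y - (1 + 1/k)\, x \\
&= (1 - 7/k)(y - x) + 4y/k - 8x/k \\
&\geq (1 - 7/k)(y - x).
\end{aligned}$$

In the case that $\frac{y}{2} \leq x \leq y$, for $j \in \{0, 1, \ldots, k-3\}$, we have

$$d_{S_i}(f_i(x), f_i(y)) \geq (y - x)/(1 - 2^{1-k}) \geq y - x,$$

when $i \equiv \lfloor \log_2 x \rfloor - j \pmod{k}$. From this, we conclude that

$$\frac{1}{k} \sum_{i=1}^{k} d_{S_i}(f_i(x), f_i(y)) \geq \frac{1}{k} \sum_{j=0}^{k-3} (y - x) \geq \frac{k-2}{k} (y - x), \tag{25}$$

yielding condition (ii).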

It is also straightforward to check that

$$\ell(f_i(x)) \leq 2^{\lfloor \log_2 x \rfloor + k - 1} \leq 2^{k-1} x,$$

which verifies condition (iv).

To verify condition (v), note that for $x \in [0, \infty)$, the inequality $d_{S_i}(f_i(x), f_i(0)) \leq x/2$ can only
hold when $i$ is congruent modulo $k$ to $\lfloor \log_2 x \rfloor$ or $\lfloor \log_2 x \rfloor + 1$, hence condition (iv) implies condition (v).

Finally we verify condition (vi). We divide the problem into two cases. If $x \leq y/2$, then by
condition (iv),

$$\ell(f_i(x)) + \ell(f_i(y)) \leq 2^{k-1}(x + y) \leq 2^{k-1}(2y) \leq 2^{k+1}(y - x).$$

In the case that $y/2 \leq x \leq y$, $f_i(x)$ and $f_i(y)$ can be mapped to different branches of $S_i$ only for
$i \equiv \lfloor \log_2 y \rfloor \pmod{k}$, yielding condition (vi). □
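The star maps are explicit enough to test numerically. The following sketch implements $f_i$ as a pair (branch index, distance from the root) and spot-checks conditions (ii) and (iv) on random inputs. The function names, the sampling range, and the floating-point tolerances are illustrative assumptions, and a passing check is only evidence on the sampled points, not a proof.

```python
import math
import random

def star_map(i, k, x):
    """f_i(x) as (branch j, distance from root); the root is (None, 0.0)."""
    if x == 0:
        return (None, 0.0)
    j = math.floor((math.log2(x) - i) / k)
    while 2.0 ** (i + k * j) > x:              # guard against rounding at endpoints
        j -= 1
    while 2.0 ** (i + k * (j + 1)) <= x:
        j += 1
    lo = 2.0 ** (i + k * j)
    mid = 2.0 ** (i + k * (j + 1) - 1)
    hi = 2.0 ** (i + k * (j + 1))
    d = (x - lo) / (1 - 2.0 ** (1 - k)) if x < mid else hi - x
    return (j, d) if d > 0 else (None, 0.0)

def star_dist(p, q):
    """Length metric on the star: along one branch, or through the root."""
    (jp, dp), (jq, dq) = p, q
    if jp == jq or jp is None or jq is None:
        return abs(dp - dq)
    return dp + dq

def branch_length(i, k, p):
    j, _ = p
    return 0.0 if j is None else 2.0 ** (i + k * (j + 1) - 1)

def check_star_conditions(k, samples=1000, seed=0):
    """Spot-check conditions (ii) and (iv) on random pairs x, y."""
    rng = random.Random(seed)
    for _ in range(samples):
        x, y = rng.uniform(0.01, 100.0), rng.uniform(0.01, 100.0)
        total = 0.0
        for i in range(1, k + 1):
            p, q = star_map(i, k, x), star_map(i, k, y)
            total += star_dist(p, q)
            if branch_length(i, k, p) > 2.0 ** (k - 1) * x * (1 + 1e-12):   # (iv)
                return False
        if total / k < (1 - 7.0 / k) * abs(x - y) - 1e-9:                    # (ii)
            return False
    return True
```

Condition (ii) is only informative for $k > 7$, so values such as $k = 8$ or $k = 10$ are the natural ones to try.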

Finally, we move on to the proof of Lemma 3.2.

Proof of Lemma 3.2. We put $k = \lceil 7/\varepsilon \rceil$ and prove the following stronger statement by induction on $|V|$: There exist metric trees $T_1, T_2, \ldots, T_k$ and monotone colorings $\chi_i : E(T_i) \to \mathbb{N}$, along with mappings $f_i : V \to V(T_i)$ satisfying the conditions of the lemma. Furthermore, each coloring $\chi_i$ satisfies the stronger condition for all $v \in V$,

$$|\chi_i(P_{f_i(v)})| \le |\chi(P_v)|. \tag{26}$$

The statement is trivial for the tree containing only a single vertex. Now suppose that we have a tree $T$ and coloring $\chi : E \to \mathbb{N}$. Since $T$ is connected, it is easy to see that there exists a color class $c \in \chi(E)$ with the following property. Let $\gamma_c$ be the path whose edges are colored $c$, and let $v_c$ be the vertex of $\gamma_c$ closest to the root. Then the induced tree $T'$ on the vertex set $(V \setminus V(\gamma_c)) \cup \{v_c\}$ is connected.

Applying the inductive hypothesis to $T'$ and $\chi|_{E(T')}$ yields metric trees $T'_1, T'_2, \ldots, T'_k$ with colorings $\chi'_i : E(T'_i) \to \mathbb{N}$ and mappings $f'_i : V(T') \to V(T'_i)$.

Now, let $S_1, \ldots, S_k$ and $\{g_i : [0,\infty) \to S_i\}$ be the $\mathbb{R}$-stars and mappings guaranteed by Lemma 3.3. For each $i \in [k]$, let $S'_i$ be the induced subgraph of $S_i$ on the set $\{g_i(d_T(v, v_c)) : v \in V(\gamma_c)\}$, and make $S'_i$ into a metric tree rooted at $g_i(0)$, with the length structure inherited from $S_i$. We now construct $T_i$ by attaching $S'_i$ to $T'_i$ with the root of $S'_i$ identified with the node $f'_i(v_c)$. The coloring $\chi'_i$ is extended to $T_i$ by assigning to each root-leaf path in $S'_i$ a new color. Finally, we specify functions $f_i : V \to V(T_i)$ via

$$f_i(v) = \begin{cases} f'_i(v) & v \in V(T') \\ g_i(d_T(v_c, v)) & v \in V \setminus V(T'). \end{cases}$$

It is straightforward to verify that (26) holds for the colorings $\{\chi_i\}$ and every vertex $v \in V$.



In addition, using the inductive hypothesis, we have $|V(T_i)| \le |V|$ and $M(\chi_i) \le M(\chi)$ for every $i \in [k]$, with the latter condition following immediately from (26) and the structure of the mappings $\{f_i\}$.

We now verify that conditions (a), (b), and (c) hold. For $x, y \in V(T')$, the induction hypothesis guarantees all three conditions. If both $x, y \in V(\gamma_c)$, then conditions (a) and (b) follow directly from conditions (ii) and (iii) of Lemma 3.3 applied to the maps $\{g_i\}$. To verify condition (c), let $j \in [k]$ be the single bad index from (vi). We have for all $i \ne j$,

$$\rho_{\chi_i}\big(f_i(x), f_i(y); 2^{-(k+1)}\big) \le 2^{k+1} d_T(x,y).$$

Since there are at most two colors on the path between $x$ and $y$ in any $T_i$, by condition (v) of Lemma 3.3 there are at most four values of $i \in [k] \setminus \{j\}$ such that $\rho_{\chi_i}\big(f_i(x), f_i(y); 2^{-(k+1)}\big) \ne 0$, hence

$$\frac{1}{k}\sum_{i \ne j} \rho_{\chi_i}\big(f_i(x), f_i(y); 2^{-(k+1)}\big) \le \frac{4 \cdot 2^{k+1}}{k}\, d_T(x,y) \le \varepsilon\, 2^{k+1} d_T(x,y).$$

Since $\|f_i\|_{\mathrm{Lip}}$ is determined on edges $(x,y) \in E$, and each such edge has $x, y \in V(\gamma_c)$ or $x, y \in V(T')$, we have already verified condition (b) for all $i \in [k]$ and $x, y \in V$. Finally, we verify (a) and (c) for pairs with $x \in V(T')$ and $y \in V(\gamma_c)$. We can check condition (a) using the previous two cases,

$$\begin{aligned}
\frac{1}{k}\sum_{i=1}^{k} d_{T_i}(f_i(x), f_i(y)) &= \frac{1}{k}\sum_{i=1}^{k} \Big( d_{T_i}(f_i(x), f_i(v_c)) + d_{T_i}(f_i(y), f_i(v_c)) \Big) \\
&\ge (1 - \varepsilon)\, d_T(y, v_c) + (1 - \varepsilon)\, d_T(x, v_c) \\
&\ge (1 - \varepsilon)\, d_T(x, y).
\end{aligned}$$

Towards verifying condition (c), note that by condition (v) from Lemma 3.3, there are at most two values of $i$ such that

$$\rho_{\chi_i}\big(f_i(x), f_i(y); 2^{-(k+1)}\big) - \rho_{\chi_i}\big(f_i(x), f_i(v_c); 2^{-(k+1)}\big) = \rho_{\chi_i}\big(f_i(y), f_i(v_c); 2^{-(k+1)}\big) \ne 0.$$

By the induction hypothesis, there exists a number j G [k] such that 

ed T (x, Vc ) < - 1 —J2 Pxi (f i ( Vc )J i ( x y,2-^). 

Now we use condition (iv) from Lemma |3.3| to conclude, 

2 -^- E PxM*)> fiiv)\ 2 ~ k ) < £ (^(/iH> /iM; 2- fc ) + p X4 (/i(y), /i(« c );2-' 

/ 2 _(fe+i)\ 

<edT(x,^ c )+ I — —J (2 k - l d T (y,v c )) 

< ed T (x,v c ) + ed T (v c ,y) 
= ed T {x,y) , 

completing the proof. □ 
3.2 Multi-scale embeddings 

We now present the basics of our multi-scale embedding approach. The next lemma is devoted to 
combining scales together without using too many dimensions, while controlling the distortion of 
the resulting map. 

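Lemma 3.4. For every $\varepsilon \in (0,1)$, the following holds. Let $(X,d)$ be an arbitrary metric space, and consider a family of functions $\{f_i : X \to [0,1]\}_{i \in \mathbb{Z}}$ such that for all $x, y \in X$, we have

$$\sum_{i \in \mathbb{Z}} 2^i |f_i(x) - f_i(y)| < \infty. \tag{27}$$

Then there is a mapping $F : X \to \ell_1^{2 + \lceil \log 1/\varepsilon \rceil}$ such that for all $x, y \in X$,

$$(1 - \varepsilon) \sum_{i \in \mathbb{Z}} 2^i |f_i(x) - f_i(y)| - 2\zeta(x,y) \le \|F(x) - F(y)\|_1 \le \sum_{i \in \mathbb{Z}} 2^i |f_i(x) - f_i(y)|,$$

where

$$\zeta(x,y) = \sum_{\substack{i \,:\, \exists j < i \\ f_j(x) \ne f_j(y)}} 2^i \Big( |f_i(x) - f_i(y)| - \big\lfloor |f_i(x) - f_i(y)| \big\rfloor \Big).$$

Proof. Let $k = 2 + \lceil \log 1/\varepsilon \rceil$, and fix some $x_0 \in X$. For $i \in [k]$, define $F_i : X \to \mathbb{R}$ by,

$$F_i(x) = \sum_{j \in \mathbb{Z}} 2^{jk+i} \big( f_{jk+i}(x) - f_{jk+i}(x_0) \big). \tag{28}$$

It is easy to see that (27) implies absolute convergence of the preceding sum. We will consider the map $F = F_1 \oplus F_2 \oplus \cdots \oplus F_k : X \to \ell_1^k$. It is straightforward to verify that for every $x, y \in X$,

$$\|F(x) - F(y)\|_1 \le \sum_{i \in \mathbb{Z}} 2^i |f_i(x) - f_i(y)|.$$

Now, for $i \in [k]$, define

$$\zeta_i(x,y) = \sum_{\substack{j \,:\, \exists \ell < j \\ f_{\ell k+i}(x) - f_{\ell k+i}(y) \ne 0}} 2^{jk+i} \Big( |f_{jk+i}(x) - f_{jk+i}(y)| - \big\lfloor |f_{jk+i}(x) - f_{jk+i}(y)| \big\rfloor \Big).$$

One can easily check that $\sum_{i=1}^{k} \zeta_i(x,y) \le \zeta(x,y)$, thus showing the following for $i \in [k]$ will complete our proof of the lemma,

$$|F_i(x) - F_i(y)| \ge (1 - \varepsilon) \sum_{j \in \mathbb{Z}} 2^{jk+i} |f_{jk+i}(x) - f_{jk+i}(y)| - 2\zeta_i(x,y). \tag{29}$$

Toward this end, fix $i \in [k]$ and $x, y \in X$. Let $S = \{ j \in \mathbb{Z} : |f_{jk+i}(x) - f_{jk+i}(y)| = 1 \}$, and $T = \{ j \in \mathbb{Z} : 0 < |f_{jk+i}(x) - f_{jk+i}(y)| < 1 \}$. Clearly we then have,

$$|F_i(x) - F_i(y)| = \left| \sum_{j \in S} 2^{jk+i} \big( f_{jk+i}(x) - f_{jk+i}(y) \big) + \sum_{j \in T} 2^{jk+i} \big( f_{jk+i}(x) - f_{jk+i}(y) \big) \right|.$$

If $S \cup T = \emptyset$, then (29) is immediate. Now, suppose that $S \ne \emptyset$, and let $c = i + k \cdot \max(S)$. Observe that $\max(S)$ exists by (27). We then have,

$$\begin{aligned}
\sum_{j \in \mathbb{Z}} 2^{jk+i} |f_{jk+i}(x) - f_{jk+i}(y)| &\le 2^c + \sum_{j < \max S} 2^{kj+i} + \sum_{\substack{j \in S \cup T \\ j > \max S}} 2^{kj+i} |f_{kj+i}(x) - f_{kj+i}(y)| \\
&\le 2^c + \sum_{j < \max S} 2^{kj+i} + \zeta_i(x,y) \\
&\le 2^c + 2 \cdot 2^{k(\max S - 1) + i} + \zeta_i(x,y) \\
&\le 2^c (1 + 2^{1-k}) + \zeta_i(x,y) \\
&\le (1 + \varepsilon/2)\, 2^c + \zeta_i(x,y).
\end{aligned}$$

On the other hand,

$$\begin{aligned}
|F_i(x) - F_i(y)| &\ge 2^c - \sum_{j < \max S} 2^{kj+i} - \sum_{\substack{j \in S \cup T \\ j > \max S}} 2^{kj+i} |f_{kj+i}(x) - f_{kj+i}(y)| \\
&\ge 2^c - \sum_{j < \max S} 2^{kj+i} - \zeta_i(x,y) \\
&\ge 2^c - 2 \cdot 2^{k(\max S - 1) + i} - \zeta_i(x,y) \\
&\ge 2^c (1 - 2^{1-k}) - \zeta_i(x,y) \\
&\ge (1 - \varepsilon/2)\, 2^c - \zeta_i(x,y).
\end{aligned}$$

Therefore,

$$\begin{aligned}
(1 - \varepsilon) \sum_{j \in \mathbb{Z}} 2^{kj+i} |f_{jk+i}(x) - f_{jk+i}(y)| &\le (1 - \varepsilon)\big( (1 + \varepsilon/2)\, 2^c + \zeta_i(x,y) \big) \\
&\le (1 - \varepsilon/2)\, 2^c + \zeta_i(x,y) \\
&\le |F_i(x) - F_i(y)| + 2\zeta_i(x,y),
\end{aligned}$$

completing the verification of (29) in the case when $S \ne \emptyset$.

In the remaining case when $S = \emptyset$ and $T \ne \emptyset$, if the set $T$ does not have a minimum element, then

$$\sum_{j \in \mathbb{Z}} 2^{kj+i} |f_{kj+i}(x) - f_{kj+i}(y)| = \zeta_i(x,y),$$

making (29) vacuous since the right-hand side is non-positive.

Otherwise, let $\ell = \min(T)$, and write

$$\begin{aligned}
|F_i(x) - F_i(y)| &= \left| \sum_{j \in T} 2^{kj+i} \big( f_{kj+i}(x) - f_{kj+i}(y) \big) \right| \\
&\ge 2^{\ell k+i} |f_{\ell k+i}(x) - f_{\ell k+i}(y)| - \left| \sum_{\substack{j \in T \\ j > \ell}} 2^{kj+i} \big( f_{kj+i}(x) - f_{kj+i}(y) \big) \right| \\
&\ge 2^{\ell k+i} |f_{\ell k+i}(x) - f_{\ell k+i}(y)| - \zeta_i(x,y).
\end{aligned}$$

This completes the proof. □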
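The construction in the proof of Lemma 3.4 can be tried out numerically. The following is a minimal sketch, not the authors' code: it assumes the logarithm in $k = 2 + \lceil \log 1/\varepsilon \rceil$ is base 2, restricts to a finite family of scales, and uses hypothetical names (`combine_scales`, `zeta`, `check_lemma_3_4`, `DEMO_FS`). Exact rational arithmetic avoids rounding issues.

```python
import math
import random
from fractions import Fraction
from itertools import combinations

def combine_scales(fs, eps, x0):
    """Build F = F_1 (+) ... (+) F_k for a finite family fs: scale -> {point: value in [0, 1]}."""
    k = 2 + math.ceil(math.log2(1 / eps))  # assumption: log is base 2

    def F(x):
        coords = [Fraction(0)] * k
        for s, f in fs.items():
            # Scale s = j*k + i with i in {1..k} contributes to coordinate i (0-based: (s - 1) % k).
            coords[(s - 1) % k] += Fraction(2) ** s * (f[x] - f[x0])
        return coords

    return F, k

def weighted_sum(fs, x, y):
    """sum_i 2^i |f_i(x) - f_i(y)|."""
    return sum(Fraction(2) ** s * abs(f[x] - f[y]) for s, f in fs.items())

def zeta(fs, x, y):
    """Fractional parts at scales i for which some smaller scale j < i already separates x and y."""
    total, seen_difference = Fraction(0), False
    for s in sorted(fs):
        diff = abs(fs[s][x] - fs[s][y])
        if seen_difference:
            total += Fraction(2) ** s * (diff - math.floor(diff))
        seen_difference = seen_difference or diff != 0
    return total

def check_lemma_3_4(fs, points, eps):
    """Check both inequalities of the lemma for every pair of points."""
    F, _ = combine_scales(fs, eps, points[0])
    for x, y in combinations(points, 2):
        dist = sum(abs(a - b) for a, b in zip(F(x), F(y)))
        upper = weighted_sum(fs, x, y)
        lower = (1 - Fraction(eps)) * upper - 2 * zeta(fs, x, y)
        if not (lower <= dist <= upper):
            return False
    return True

# Hypothetical demo data: 5 points, scales -4..8, values in {0, 1/4, 1/2, 3/4, 1}.
_rng = random.Random(0)
DEMO_POINTS = list(range(5))
DEMO_FS = {s: {x: Fraction(_rng.randint(0, 4), 4) for x in DEMO_POINTS} for s in range(-4, 9)}
```

Calling `check_lemma_3_4(DEMO_FS, DEMO_POINTS, 0.25)` exercises the map with $k = 4$ coordinates; scales that share a residue class modulo $k$ are summed into the same coordinate, which is where the dimension saving comes from.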

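In Section 5, we will require the following straightforward corollary.

Corollary 3.5. For every $\varepsilon \in (0,1)$ and $m \in \mathbb{N}$, the following holds. Let $(X,d)$ be a metric space, and suppose we have a family of functions $\{f_i : X \to [0,1]^m\}_{i \in \mathbb{Z}}$ such that for all $x, y \in X$,

$$\sum_{i \in \mathbb{Z}} 2^i \|f_i(x) - f_i(y)\|_1 < \infty.$$

Then there exists a mapping $F : X \to \ell_1^{m(2 + \lceil \log 1/\varepsilon \rceil)}$ such that for all $x, y \in X$,

$$(1 - \varepsilon) \sum_{i \in \mathbb{Z}} 2^i \|f_i(x) - f_i(y)\|_1 - 2\zeta(x,y) \le \|F(x) - F(y)\|_1 \le \sum_{i \in \mathbb{Z}} 2^i \|f_i(x) - f_i(y)\|_1,$$

where

$$\zeta(x,y) = \sum_{k=1}^{m} \sum_{\substack{i \,:\, \exists j < i \\ f_j(x)_k \ne f_j(y)_k}} 2^i \Big( |f_i(x)_k - f_i(y)_k| - \big\lfloor |f_i(x)_k - f_i(y)_k| \big\rfloor \Big), \tag{30}$$

and we have used the notation $x_k$ for the $k$th coordinate of $x \in \mathbb{R}^m$.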

4 Scale assignment 

Let $T = (V, E)$ be a metric tree with root $r \in V$, equipped with a monotone coloring $\chi : E \to \mathbb{N}$. We will now describe a way of assigning "scales" to the vertices of $T$. These scale values will be used in Section 5 to guide our eventual embedding. The scales of a vertex will describe, roughly, the subset and magnitude of coordinates that should differ between the vertex and its neighbors. First, we fix some notation.

For every $c \in \chi(E)$, we use $\gamma_c$ to denote the path in $T$ colored $c$, and we use $v_c$ to denote the vertex of $\gamma_c$ which is closest to the root. We will also use the notation $T(c)$ to denote the subtree of $T$ under the color $c$; formally, $T(c)$ is the induced (rooted) subtree on $\{v_c\} \cup V(T_u)$ where $u \in V$ is the child of $v_c$ such that $\chi(v_c, u) = c$, and $T_u$ is the subtree rooted at $u$.

We will write $p(v)$ for the parent of a vertex $v \in V$, and $p(r) = r$. Furthermore, we define the "parent color" of a color class by $p(c) = \chi(v_c, p(v_c))$ with the convention that $\chi(r, r) = c_0$, where $c_0 \in \mathbb{N} \setminus \chi(E)$ is some fixed element. Finally, we put $T(c_0) = T$.

4.1 Scale selectors 

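We start by defining a function $\kappa : \chi(E) \cup \{c_0\} \to \mathbb{N}$ which describes the "branching factor" for each color class,

$$\kappa(c) = \left\lfloor \log_2 \frac{|E(T(p(c)))|}{|E(T(c))|} \right\rfloor + 1. \tag{31}$$

Moreover, we define $\varphi : \chi(E) \cup \{c_0\} \to \mathbb{N} \cup \{0\}$ inductively by setting $\varphi(c_0) = 0$, and

$$\varphi(c) = \kappa(c) + \varphi(p(c)), \tag{32}$$

for $c \in \chi(E)$.

Observe that for every color $c \in \chi(E)$, we have,

$$\varphi(c) = \sum_{c'} \kappa(c') \le \sum_{c'} \left( 1 + \log_2 \frac{|E(T(p(c')))|}{|E(T(c'))|} \right) \le M(\chi) + \log_2 |E|, \tag{33}$$

where both sums range over $c$ and its ancestors $c' = c, p(c), p(p(c)), \ldots$ in $\chi(E)$; the logarithms telescope, and there are at most $M(\chi)$ terms.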
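As an illustration of (31) and (32), here is a small sketch that computes $\kappa$ and $\varphi$ on a toy colored tree and checks the counting bound of Lemma 4.7 below. The tree, the names (`EDGES`, `kappa`, `phi`, `lemma_4_7_holds`), and the reading of (31) with a floor are all assumptions made for this example.

```python
# Hypothetical toy tree: child -> (parent, colour of the edge to the parent); vertex 0 is the root.
EDGES = {1: (0, "a"), 2: (1, "a"), 3: (2, "a"), 4: (1, "b"), 5: (4, "b"), 6: (0, "c"), 7: (6, "c")}
ROOT, C0 = 0, "c0"
COLOURS = sorted({col for _, col in EDGES.values()})

def subtree_size(v):
    """Number of vertices in the subtree rooted at v."""
    return 1 + sum(subtree_size(u) for u, (p, _) in EDGES.items() if p == v)

def top_edge_child(c):
    """The vertex u with chi(v_c, u) = c: child end of the topmost edge coloured c."""
    return next(u for u, (p, col) in EDGES.items()
                if col == c and (p == ROOT or EDGES[p][1] != c))

def edges_below(c):
    """|E(T(c))|; T(c0) is the whole tree, and T(c) has as many edges as T_u has vertices."""
    return len(EDGES) if c == C0 else subtree_size(top_edge_child(c))

def parent_colour(c):
    """p(c) = chi(v_c, p(v_c)), with the convention chi(r, r) = c0."""
    v_c = EDGES[top_edge_child(c)][0]
    return C0 if v_c == ROOT else EDGES[v_c][1]

def kappa(c):
    """Assumed form of (31): floor(log2(|E(T(p(c)))| / |E(T(c))|)) + 1."""
    ratio = edges_below(parent_colour(c)) // edges_below(c)  # integer part, always >= 1
    return ratio.bit_length()  # equals floor(log2(ratio)) + 1

def phi(c):
    """Recursion (32) with phi(c0) = 0."""
    return 0 if c == C0 else kappa(c) + phi(parent_colour(c))

def in_subtree(c_sub, c):
    """True if colour c_sub appears on an edge of T(c)."""
    while c_sub != C0:
        if c_sub == c:
            return True
        c_sub = parent_colour(c_sub)
    return False

def lemma_4_7_holds():
    """Check #{c' in T(c) : phi(c') - phi(c) = k} <= 2^k for every colour c and every k."""
    for c in COLOURS:
        gaps = [phi(c2) - phi(c) for c2 in COLOURS if in_subtree(c2, c)]
        if any(gaps.count(k) > 2 ** k for k in set(gaps)):
            return False
    return True
```

On this tree the three color classes are the paths 0-1-2-3 (`"a"`), 1-4-5 (`"b"`) and 0-6-7 (`"c"`); a color whose subtree is small relative to its parent's receives a larger $\kappa$.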

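Next, we use $\varphi$ to inductively define our scale selectors. Let

$$m(T) = \min\{\mathrm{len}(e) : e \in E \text{ and } \mathrm{len}(e) > 0\}.$$

We now define a family of functions $\{\tau_i : V \to \mathbb{N} \cup \{0\}\}_{i \in \mathbb{Z}}$.

For $v \in V$, let $c = \chi(v, p(v))$, and put $\tau_i(v) = 0$ for $i < \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor$, and otherwise,

$$\tau_i(v) = \min \Bigg( \underbrace{\left\lceil \frac{d_T(v, v_c) - \min\left( d_T(v, v_c), \sum_{j=-\infty}^{i-1} 2^j \tau_j(v) \right)}{2^i} \right\rceil}_{\text{(A)}},\ \underbrace{\varphi(c) - \sum_{c' \in \chi(E(P_v))} \tau_i(v_{c'})}_{\text{(B)}} \Bigg). \tag{34}$$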

The value of $\tau_i(v)$ will be used in Section 5 to determine how many coordinates of magnitude $\asymp 2^i$ change as the embedding proceeds from $v_c$ to $v$. In this definition, we try to cover the distance from root to $v$ with the smallest scales possible while satisfying the inequality

$$\varphi(c) \ge \tau_i(v) + \sum_{c' \in \chi(E(P_v))} \tau_i(v_{c'}).$$

For $v \in V \setminus \{r\}$, let $c = \chi(v, p(v))$. For each $i \in \mathbb{Z}$, part (B) of (34) for $\tau_i(v_c)$ implies that

$$\tau_i(v_c) \le \varphi(p(c)) - \sum_{c' \in \chi(E(P_{v_c}))} \tau_i(v_{c'}).$$

Hence,

$$\begin{aligned}
\varphi(c) - \sum_{c' \in \chi(E(P_v))} \tau_i(v_{c'}) &= \varphi(c) - \tau_i(v_c) - \sum_{c' \in \chi(E(P_{v_c}))} \tau_i(v_{c'}) \\
&\ge \varphi(c) - \varphi(p(c)) \\
&= \kappa(c) \\
&\ge 1.
\end{aligned} \tag{35}$$



Therefore, part (B) of (34) is always positive, so if $\tau_k(v) = 0$ for some $k \ge \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor$, then $\tau_k(v)$ is defined by part (A) of (34). Hence $\sum_{j=-\infty}^{k-1} 2^j \tau_j(v) \ge d_T(v, v_c)$ and the following observation is immediate.

Observation 4.1. For $v \in V$ and $k \ge \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor$, if $\tau_k(v) = 0$ then for all $i \ge k$, $\tau_i(v) = 0$.

Comparing part (A) of (34) for $\tau_i(v)$ and $\tau_{i+1}(v)$ also allows us to observe the following.

Observation 4.2. For $v \in V$ and $k \ge \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor$, if part (A) in (34) for $\tau_k(v)$ is less than or equal to part (B) then for all $i > k$, $\tau_i(v) = 0$.



4.2 Properties of the scale selector maps 

We now prove some key properties of the maps $\kappa$, $\varphi$, and $\{\tau_i\}$.

Lemma 4.3. For every vertex $v \in V$ with $c = \chi(v, p(v))$, the following holds. For all $i \in \mathbb{Z}$ with $\frac{d_T(v, v_c)}{\kappa(c)} < 2^{i-1}$ we have $\tau_i(v) = 0$.

Proof. If $d_T(v, v_c) = 0$, the lemma is vacuous. Suppose now that $d_T(v, v_c) > 0$, and let $k = \left\lceil \log_2 \left( \frac{d_T(v, v_c)}{\kappa(c)} \right) \right\rceil$. We have $d_T(v, v_c) \ge m(T)$ and $\kappa(c) \le \log_2 |E| + 1$, therefore

$$k \ge \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor.$$

It follows that for $i \ge k$, $\tau_i(v)$ is given by (34).

If $\tau_k(v) = 0$, then by Observation 4.1 for all $i \ge k$, $\tau_i(v) = 0$.

On the other hand if $\tau_k(v) \ne 0$ then either it is determined by part (B) of (34), in which case

$$\tau_k(v) = \varphi(c) - \sum_{c' \in \chi(E(P_v))} \tau_k(v_{c'}) = \varphi(c) - \tau_k(v_c) - \sum_{c' \in \chi(E(P_{v_c}))} \tau_k(v_{c'}) \ge \varphi(c) - \varphi(p(c)) = \kappa(c),$$

implying that

$$\sum_{j=-\infty}^{k} 2^j \tau_j(v) \ge \kappa(c)\, 2^k \ge d_T(v, v_c).$$

Examining part (A) of (34), we see that $\tau_{k+1}(v) = 0$, and by Observation 4.1, $\tau_i(v) = 0$ for $i > k$. Alternately, $\tau_k(v)$ is determined by part (A) of (34), and by Observation 4.2, $\tau_i(v) = 0$ for $i > k$, completing the proof. □



The next lemma shows how the values $\{\tau_i(v)\}$ track the distance from $v_c$ to $v$.

Lemma 4.4. For any vertex $v \in V$ with $c = \chi(v, p(v))$, we have

$$d_T(v, v_c) \le \sum_{i=-\infty}^{\infty} 2^i \tau_i(v) \le 3\, d_T(v, v_c).$$

Proof. If $d_T(v, v_c) = 0$, the lemma is vacuous. Suppose now that $d_T(v, v_c) > 0$, and let

$$k = \max\{i : \tau_i(v) \ne 0\}.$$

By Lemma 4.3, the maximum exists.

We have $\tau_{k+1}(v) = 0$, and thus inequality (35) implies that part (A) of (34) specifies $\tau_{k+1}(v)$, yielding

$$d_T(v, v_c) \le \sum_{i=-\infty}^{k} 2^i \tau_i(v) = \sum_{i=-\infty}^{\infty} 2^i \tau_i(v).$$

On the other hand, since $\tau_k(v) > 0$, we must have $d_T(v, v_c) > \sum_{i=-\infty}^{k-1} 2^i \tau_i(v)$, and Lemma 4.3 implies that $2^k \le 2\, d_T(v, v_c)$, hence,

$$\begin{aligned}
\sum_{i=-\infty}^{k} 2^i \tau_i(v) &\le \sum_{i=-\infty}^{k-1} 2^i \tau_i(v) + 2^k \left\lceil \frac{d_T(v, v_c) - \sum_{i=-\infty}^{k-1} 2^i \tau_i(v)}{2^k} \right\rceil \\
&\le \sum_{i=-\infty}^{k-1} 2^i \tau_i(v) + 2^k \left( \frac{d_T(v, v_c) - \sum_{i=-\infty}^{k-1} 2^i \tau_i(v)}{2^k} + 1 \right) \\
&= \sum_{i=-\infty}^{k-1} 2^i \tau_i(v) + 2^k + \left( d_T(v, v_c) - \sum_{i=-\infty}^{k-1} 2^i \tau_i(v) \right) \\
&\le d_T(v, v_c) + 2^k \\
&\le 3\, d_T(v, v_c).
\end{aligned}$$

□
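To make definition (34) concrete, the sketch below evaluates the scale selectors on a toy weighted tree and checks the two-sided bound of Lemma 4.4. It is only an illustration under stated assumptions: the tree and all names (`EDGES`, `phi`, `scale_selectors`, `covered_length`) are hypothetical, part (A) is read with a ceiling, and the cutoff scale and (31) are read with a floor.

```python
import math
from fractions import Fraction

# Hypothetical toy tree: child -> (parent, edge colour, edge length); vertex 0 is the root.
EDGES = {1: (0, "a", 3), 2: (1, "a", 5), 3: (2, "a", 1),
         4: (1, "b", 2), 5: (4, "b", 7), 6: (0, "c", 4), 7: (6, "c", 1)}
ROOT = 0

def root_path(v):
    """Vertices on the path from v up to the root, excluding the root itself."""
    path = []
    while v != ROOT:
        path.append(v)
        v = EDGES[v][0]
    return path

VERTS = sorted(EDGES, key=lambda v: len(root_path(v)))            # parents before children
SIZE = {v: sum(v in root_path(w) for w in VERTS) for v in VERTS}  # |V(T_v)|

TOP, DIST = {}, {}  # TOP[v] = v_c and DIST[v] = d_T(v, v_c), where c = chi(v, p(v))
for v in VERTS:
    p, c, length = EDGES[v]
    same_path = p != ROOT and EDGES[p][1] == c
    TOP[v] = TOP[p] if same_path else p
    DIST[v] = (DIST[p] if same_path else 0) + length

def below(v):
    """|E(T(c))| for c = chi(v, p(v)); T(c0) is the whole tree."""
    if v == ROOT:
        return len(EDGES)
    return SIZE[next(w for w in root_path(v) if EDGES[w][0] == TOP[v])]

def phi(v):
    """phi(chi(v, p(v))) via (31)-(32), assuming the floor in (31)."""
    if v == ROOT:
        return 0
    kappa = (below(TOP[v]) // below(v)).bit_length()  # floor(log2(ratio)) + 1
    return kappa + phi(TOP[v])

M_CHI = max(len({EDGES[w][1] for w in root_path(v)}) for v in VERTS)
M_T = min(length for _, _, length in EDGES.values() if length > 0)
I_LO = math.floor(math.log2(M_T / (M_CHI + math.log2(len(EDGES)))))  # tau_i = 0 below this scale
I_HI = math.ceil(math.log2(max(DIST.values()))) + 2                  # tau_i = 0 above this scale

def scale_selectors():
    """Return tau as a dict keyed by (i, v), following (34)."""
    tau = {}
    for i in range(I_LO, I_HI + 1):
        tau[i, ROOT] = 0
        for v in VERTS:
            covered = sum(Fraction(2) ** j * tau[j, v] for j in range(I_LO, i))
            part_a = math.ceil((DIST[v] - min(DIST[v], covered)) / Fraction(2) ** i)
            tops = {TOP[w] for w in root_path(v)}  # {v_c' : c' in chi(E(P_v))}
            part_b = phi(v) - sum(tau[i, w] for w in tops)
            tau[i, v] = min(part_a, part_b)
    return tau

def covered_length(tau, v):
    """sum_i 2^i tau_i(v), which Lemma 4.4 places in [d_T(v, v_c), 3 d_T(v, v_c)]."""
    return sum(Fraction(2) ** i * tau[i, v] for i in range(I_LO, I_HI + 1))
```

For vertex 1 (distance 3 from $v_c$, with $\varphi = 1$) part (B) caps every $\tau_i$ at 1, so the scales $2^{-3}, \ldots, 2^{1}$ each contribute once before part (A) drops to zero.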



The following lemma shows that for any color $c \in \chi(E)$ the value of $\tau_i$ does not decrease as we move further from $v_c$ in $\gamma_c$.

Lemma 4.5. Let $u, w \in V$ be such that $c = \chi(w, p(w)) = \chi(u, p(u))$, and $d_T(w, v_c) \le d_T(u, v_c)$. Then for all $i \in \mathbb{Z}$, we have

$$\tau_i(w) \le \tau_i(u).$$

Proof. First let $k$ be the smallest integer for which,

$$\left\lceil \frac{d_T(w, v_c) - \min\left( d_T(w, v_c), \sum_{j=-\infty}^{k-1} 2^j \tau_j(w) \right)}{2^k} \right\rceil \le \varphi(c) - \sum_{c' \in \chi(E(P_w))} \tau_k(v_{c'}).$$

This $k$ exists since, by (35), the right hand side is always positive, while by Lemma 4.3, the left hand side must be zero for some $k \in \mathbb{Z}$.

For $i > k$, by Observation 4.2 we have, $\tau_i(w) = 0$. Therefore, for $i > k$, we have $\tau_i(u) \ge \tau_i(w)$. We now use induction on $i$ to show that for $i < k$, $\tau_i(u) = \tau_i(w)$, and for $i = k$, $\tau_k(u) \ge \tau_k(w)$.

Recall that, for $i < \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor$, we have $\tau_i(w) = \tau_i(u) = 0$, which gives us the base case of the induction.

Now, by definition of $k$, part (B) of (34) for $\tau_{k-1}(w)$ is an integer strictly less than part (A), hence

$$\begin{aligned}
\sum_{j=-\infty}^{k-1} 2^j \tau_j(w) &= 2^{k-1} \tau_{k-1}(w) + \sum_{j=-\infty}^{k-2} 2^j \tau_j(w) \\
&\le 2^{k-1} \left( \left\lceil \frac{d_T(w, v_c) - \sum_{j=-\infty}^{k-2} 2^j \tau_j(w)}{2^{k-1}} \right\rceil - 1 \right) + \sum_{j=-\infty}^{k-2} 2^j \tau_j(w) \\
&< 2^{k-1} \left( \frac{d_T(w, v_c) - \sum_{j=-\infty}^{k-2} 2^j \tau_j(w)}{2^{k-1}} \right) + \sum_{j=-\infty}^{k-2} 2^j \tau_j(w) \\
&= d_T(w, v_c).
\end{aligned} \tag{36}$$

For $\left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor \le i \le k$, by (36), and as $d_T(u, v_c) \ge d_T(w, v_c)$, we have

$$\min\left( d_T(w, v_c), \sum_{j=-\infty}^{i-1} 2^j \tau_j(w) \right) = \sum_{j=-\infty}^{i-1} 2^j \tau_j(w) = \min\left( d_T(u, v_c), \sum_{j=-\infty}^{i-1} 2^j \tau_j(w) \right). \tag{37}$$

By our induction hypothesis for all $j < i$, $\tau_j(w) = \tau_j(u)$, so using (37) we can write,

$$d_T(w, v_c) - \min\left( d_T(w, v_c), \sum_{j=-\infty}^{i-1} 2^j \tau_j(w) \right) \le d_T(u, v_c) - \min\left( d_T(u, v_c), \sum_{j=-\infty}^{i-1} 2^j \tau_j(u) \right). \tag{38}$$

Since $\chi(w, p(w)) = \chi(u, p(u))$, for all $i \in \mathbb{Z}$ part (B) of (34) is identical for $\tau_i(u)$ and $\tau_i(w)$. Therefore, using (38), and the definition of $k$, for all $\left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor \le i < k$, part (B) of (34) specifies $\tau_i(u)$ and $\tau_i(w)$, hence

$$\tau_i(u) = \tau_i(w) = \varphi(c) - \sum_{c' \in \chi(E(P_w))} \tau_i(v_{c'}).$$

For the case that $i = k$, part (B) of (34) is identical for $\tau_k(u)$ and $\tau_k(w)$, and inequality (38) implies that part (A) of (34) for $\tau_k(u)$ is at least as large as part (A) of (34) for $\tau_k(w)$, completing the proof. □

The next lemma bounds the distance between two vertices in the graph based on $\{\tau_i\}$.

Lemma 4.6. Let $k > \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor$ be an integer. For any two vertices $w$ and $u$ such that $\tau_k(u) \ne 0$, $\tau_{k-1}(w) = 0$ and $\chi(w, p(w)) = \chi(u, p(u))$, we have

$$d_T(u, w) > 2^{k-1}.$$

Proof. By Observation 4.1, $\tau_k(w) = 0$. Letting $c = \chi(u, p(u))$, by Lemma 4.5 we have $d_T(v_c, u) > d_T(v_c, w)$. Using Lemma 4.5 again, we can conclude that for all $i \in \mathbb{Z}$, $\tau_i(u) \ge \tau_i(w)$. Since $\tau_{k-1}(w) = 0$, inequality (35) implies that part (A) of (34) specifies $\tau_{k-1}(w)$. Therefore,

$$\begin{aligned}
d_T(w, v_c) &\le \sum_{i=-\infty}^{k-2} 2^i \tau_i(w) \\
&\le \sum_{i=-\infty}^{k-2} 2^i \tau_i(u) \\
&= \left( \sum_{i=-\infty}^{k-1} 2^i \tau_i(u) \right) - 2^{k-1} \tau_{k-1}(u).
\end{aligned} \tag{39}$$

Since $\tau_k(u) > 0$, using part (A) of (34), we can write

$$d_T(u, v_c) > \sum_{i=-\infty}^{k-1} 2^i \tau_i(u). \tag{40}$$

Observation 4.1 implies that $\tau_{k-1}(u) \ne 0$, thus $\tau_{k-1}(u) \ge 1$, and using (39) and (40), we have

$$d_T(w, u) = d_T(u, v_c) - d_T(w, v_c) > 2^{k-1},$$

completing the proof. □

The next lemma and the following two corollaries bound the number of colors $c$ in the tree which have a small value of $\varphi(c)$.

Lemma 4.7. For any $k \in \mathbb{N} \cup \{0\}$, and any color $c \in \chi(E)$, we have

$$\#\{c' \in \chi(E(T(c))) : \varphi(c') - \varphi(c) = k\} \le 2^k.$$

Proof. We start the proof by comparing the size of the subtrees $T(c')$ and $T(c)$ for $c' \in \chi(E(T(c)))$.

For a given color $c' \in \chi(E(T(c)))$, we define the sequence $\{c_i\}_{i \in \mathbb{N}}$ as follows. We put $c_1 = c'$ and for $i > 1$ we put $c_i = p(c_{i-1})$. Suppose now that $c_m = c$, we have

$$\begin{aligned}
\varphi(c_1) - \varphi(c_m) &= \sum_{i=1}^{m-1} \kappa(c_i) \\
&\ge \sum_{i=1}^{m-1} \log_2 \left( \frac{|E(T(c_{i+1}))|}{|E(T(c_i))|} \right) \\
&= \log_2 \left( \frac{|E(T(c_m))|}{|E(T(c_1))|} \right).
\end{aligned}$$

This inequality implies that

$$|E(T(c))| \le 2^{\varphi(c') - \varphi(c)}\, |E(T(c'))|.$$

It is easy to check that for colors $a, b \in \chi(E(T(c)))$ such that $\varphi(a) = \varphi(b)$, subtrees $T(a)$ and $T(b)$ are edge disjoint. Therefore, for $k \in \mathbb{N} \cup \{0\}$, summing over all the colors $c'$ such that $\varphi(c') - \varphi(c) = k$ gives

$$\#\{c' \in \chi(E(T(c))) : \varphi(c') - \varphi(c) = k\} \le \sum_{c' \,:\, \varphi(c') - \varphi(c) = k} \frac{2^{\varphi(c') - \varphi(c)}\, |E(T(c'))|}{|E(T(c))|} = 2^k \sum_{c' \,:\, \varphi(c') - \varphi(c) = k} \frac{|E(T(c'))|}{|E(T(c))|} \le 2^k.$$

□

The following two corollaries are immediate from Lemma 4.7.

Corollary 4.8. For any $k \in \mathbb{N}$, and any color $c \in \chi(E)$, we have

$$\#\{c' \in \chi(E(T(c))) : \varphi(c') - \varphi(c) \le k\} < 2^{k+1}.$$

Corollary 4.9. For any color $c \in \chi(E)$, and constant $C \ge 2$, we have

$$\sum_{c' \in \chi(E(T(c))) \setminus \{c\}} 2^{-C(\varphi(c') - \varphi(c))} \le 2^{2-C}.$$

The next lemma is similar to Lemma 4.6. The assumption is more general, and the conclusion is correspondingly weaker. This result is used primarily to enable the proof of Lemma 4.11.

Lemma 4.10. Let $u \in V$ and $w \in V(P_u)$ be such that $\varphi(\chi(u, p(u))) > \varphi(\chi(w, p(w)))$. For all vertices $x \in V(T_u)$, and $k \in \mathbb{Z}$ with

$$2^k > \frac{6\, d_T(x, w)}{\varphi(\chi(u, p(u))) - \varphi(\chi(w, p(w)))}, \tag{42}$$

we have $\tau_k(x) = 0$.



Proof. In the case that $d_T(x, w) = 0$, this lemma is vacuous. Suppose now that $d_T(x, w) > 0$. Let $c_1, \ldots, c_m$ be the set of colors that appear on the path $P_{x\,p(w)}$, in order from $x$ to $p(w)$, and for $i \in [m]$, let $y_i = v_{c_i}$. We prove this lemma by showing that if,

$$k > \log_2 \left( \frac{6\, d_T(x, w)}{\varphi(\chi(u, p(u))) - \varphi(\chi(w, p(w)))} \right), \tag{43}$$

then part (A) of (34) for $\tau_k(x)$ is zero.

First note that $\varphi(\chi(u, p(u))) - \varphi(\chi(w, p(w))) \le M(\chi) + \log_2 |E|$ and $d_T(x, w) \ge m(T)$, hence (43) implies

$$k > \left\lfloor \log_2 \left( \frac{m(T)}{M(\chi) + \log_2 |E|} \right) \right\rfloor.$$

By Lemma 4.4, we have

$$\sum_{i=1}^{m-2} 2^{k-1} \tau_{k-1}(y_i) \le \sum_{i=1}^{m-2} \sum_{j=-\infty}^{\infty} 2^j \tau_j(y_i) \le \sum_{i=1}^{m-2} 3\, d_T(y_i, y_{i+1}) = 3\, d_T(y_1, y_{m-1}). \tag{44}$$

Now, using (42) gives

$$\begin{aligned}
\varphi(c_1) - \varphi(c_m) &\ge \varphi(\chi(u, p(u))) - \varphi(\chi(w, p(w))) \\
&> \frac{6\, d_T(x, w)}{2^k} \\
&\ge \frac{6\, d_T(x, y_{m-1})}{2^k}.
\end{aligned} \tag{45}$$

Using the above inequality and (44), we can write



d T (x,yi) = d T {x,y, 



m—l i 



<fr{yi,y-. 



tk-l 



rn— 1 ) 
m-2 



i=l

First, note that $c_m = \chi(y_{m-1}, p(y_{m-1}))$. Now, we use part (B) of (34) for $\tau_{k-1}(y_{m-1})$ to write

Tfe-l(ym-l) + 



d T (x,yi) < 




m-2 

E 

i=l 



Tk-liVi) 



^2 T k _i(v d ) 



E 

c'&x(E(P x )) 



r fc _i(f; c / 



(46) 



Therefore, either part (A) of (34) specifies $\tau_{k-1}(x)$, in which case by Observation 4.2, $\tau_i(x) = 0$ for $i \ge k$, or part (B) of (34) specifies $\tau_{k-1}(x)$, in which case by (46) we have,

$$\tau_{k-1}(x)\, 2^{k-1} \ge d_T(x, y_1),$$

and part (A) of (34) is zero for $i \ge k$. □

In Section 5, we give the description of our embedding and analyze its distortion. In the analysis of the embedding, for a given pair of vertices $x, y \in V$, we divide the path between $x$ and $y$ into subpaths and for each subpath we show that either the contribution of that subpath to the distance between $x$ and $y$ in the embedding is "large" through a concentration of measure argument, or we use the following lemma to show that the length of the subpath is "small," compared to the distance between $x$ and $y$. The complete argument is somewhat more delicate and one can find the details of how Lemma 4.11 is used in the proof of Lemma 5.15.

Lemma 4.11. There exists a constant $C > 0$ such that the following holds. For any $c \in \chi(E)$ and $v \in V(T(c))$, and for any $\varepsilon \in (0, \tfrac{1}{2}]$, there are vertices $u, u' \in V$ with $u \ne u'$ and $d_T(u, v) \le \varepsilon\, d_T(u, u')$, and such that,

$$u, u' \in \{v_a : a \in \chi(E(P_{v v_c}))\} \cup \{v\}.$$

Furthermore, for all vertices $x \in V(P_{u'u}) \setminus \{u'\}$, for all $k \in \mathbb{Z}$,

T k ( X ) + = 



2 fc < 



Cdr(u, v!) 
(<p(x(u,p(u))) - ip(x(v c ,p(v c )))) 



Proof. Let $r' = v_c$, and let $c_1, \ldots, c_m$ be the set of colors that appear on the path $P_{v r'}$ in order from $v$ to $r'$, and put $c_{m+1} = \chi(r', p(r'))$. We define $y_0 = v$, and for $i \in [m]$, $y_i = v_{c_i}$. Note that $\{y_0, \ldots, y_m\} = \{v\} \cup \{v_a : a \in \chi(E(P_{v v_c}))\}$, and for $i \le m$, $\chi(y_i, p(y_i)) = c_{i+1}$. We give a constructive proof for the lemma.

For $i \in \mathbb{N}$, we construct a sequence $(a_i, b_i) \in \mathbb{N} \times \mathbb{N}$, the idea being that $P_{y_{a_i} y_{b_i}}$ is a nonempty subpath of $P_{v r'}$ such that for different values of $i$, these subpaths are edge disjoint. At each step of the construction either we can use $(a_i, b_i)$ to find $u$ and $u'$ such that they satisfy the properties of this lemma, or we find $(a_{i+1}, b_{i+1})$ such that $b_{i+1} < b_i$. The last condition guarantees that we can always find $u$ and $u'$ that satisfy the conditions of this lemma.






We start with a% = m and b± = m — 1. If dx(v,y bl ) < sdT(y ai ,ybi) then 

mi Dm— I 



f(x(ym-i,p(y m -i))) - f(x(r ; ,p(r ; )))J k(c) 



and by Lemma 4.3 the assignment v! = y ai and u = y bl satisfies the conditions of this lemma if 
C > Otherwise, for i > 1, we choose (aj+i,6j+i) based on (aj,6j), and construct the rest of the 
sequence preserving the following three properties: 

i) ¥>(c6i+l) - ^(Ca,+l) > ^(Ca.+l) - V(X(^,P(^))); 

h) d T (y b .,v) > ed T (y h ,y ai ); 
hi) ai> bi. 

Let j G {0, . . . , m} be the maximum integer such that edriyj^ybi) > dx{v,yj). Note that j < 6j, 
and the maximum always exists because yo = v. We will now split the proof into three cases. 

Case I: <p(c j+2 ) - p(c bi+ i) > 2(^(c bi+ i) - cp(c ai +i)). 

In this case by condition (hi), ip(cb i+ i) — ip(c ai+ i) > 0. Hence j + 1 < bi, and we can preserve 
conditions (i), (ii) and (hi) with 

(a i+1 ,b i+1 ) = (bi,j + 1). 

Case II: (f(c j+2 ) - <p{c bi+ i) < 2(<p(c bt+1 ) - f(c ai+ i)) and f{c j+1 ) - <p(c bi+l ) > Q(tp(c bi+ i) - 
v(Pm+i))- 



In this case by (|32j) we have, 

n(c j+l ) = (p(c j+1 ) - (p(c j+2 ) = (<p(cj +1 ) - vK+l)) - (y(cj+2) - ^(cbj+i)). 
Using the conditions of this case, we write 

K { C j + l) = (V(cj+l) - ^(Cfe l + l)) - Mcj+2) - <^(c 6i+ l)) 

> 6(<^(c 6i+1 ) - y(c 0j+ i)) - (v?(c i+2 ) - <-p{c bi+ i)) 

= (2(^(^+1) - ^(coi+l)) + 4(^(c 6i+ i) - ^(co i+ i))) - (y?(c i+2 ) - <p(c 6i +i)) 

> ( 2 Mc6 l+ l) - y(Ca i+ l)) + 2(^(c i+2 ) - (f(c bi+ l))j - [tp{c j+2 ) - (f(c bi+1 )) , 

and by condition (i), 

K(cj +1 ) > ((<p(c bi+1 ) - ^(004+1)) + (^(ca,+i) - ¥>(x(r',p(r'))) + 2((p(c j+2 ) - p(c 6 . + i))J 

- (^(cj+2) - ¥>(0, i+ l)) 

^(c^)-^/,^/))). (47) 



Thus if dxiyj+i, v) > e dT(yj,yj+i), then (aj+i,6j+i) = (j + satisfies condition (i) by (47), 
and it is also easy to verify that it satisfies conditions (ii) and (iii). If $d_T(y_{j+1}, v) \le \varepsilon\, d_T(y_j, y_{j+1})$, then by (32),

$$\varphi(\chi(y_j, p(y_j))) = \varphi(c_{j+1}) = \kappa(c_{j+1}) + \varphi(c_{j+2})$$

and by (47),



2d T (yj,yj- 



2d T (yj,yj+i) 



Mx{yj,p(yj))) ~ f(x{r',p{r'))))J \K(c j+ i) + f(c j+2 ) - f(x{r',p(r'))) 

dr(yj,yj+i) 



> 



Hence Lemma 4.3 implies that the assignment v! = yj+i and u = yj satisfies the conditions of this 
lemma if C > 



2' 



Case III: tp(cj+i) - y{c bi+x ) < 6(ip(c bt+1 ) - ip(c m+1 )). 



In this case we use Lemma 4.10 to show that the assignment u = yj and v! = y bi satisfies the 
conditions of the lemma. We have 

f(x(yj,p(yj))) - f(x(r',p(r'))) = <p(cj +1 ) - <p(x(r',p(r'))) 

= ~ <P(Cbi+l)) + {<f(c bi +l) ~ f(Ca,+l)) 

+ (y(Coi+l) - f(x{r',p{r')))) 
< 6(<p(c 6i+ i) - <p(c ai+ i)) + (<p(c bi+ i) - (p(c ai +i)) 
+ (vCcoi+l) - f(x{r',p{r')))), 

and by condition (i), 

<f{x(yj,p(yj))) - <f{x{r' ,p(r'))) < 8(<p{c bi+1 ) - <p(c ai+1 )). 
Condition (ii) and the definition of yj imply that, 



dT{yj,yt,) > (1 - e)d T (v,y bl ) > e(l - e)d T {y av Vh) > -^d T {y av y bi ). 



Hence, 



Kl)dT{yj,ybi) 



> 



K<p(.x(.yj,p(yj))) - vixir' ,p{r')))) J \v{cb t +i) - ^(ca l+ i) 



6d T (y bi ,y ai 



and by applying Lemma 4.10 with u = y bi and w = y ai , we can conclude that the assignment u = yj 
and u' = y bi satisfies the conditions of this lemma with C = 96. □ 



5 The embedding 

We now present a proof of Theorem 3.1, thereby completing the proof of Theorem 1.1. We first introduce a random embedding of the tree $T$ into $\ell_1$, and then show that, for a suitable choice of parameters, with non-zero probability our construction satisfies the conditions of the theorem.

Notation: We use the notations and definitions introduced in Section 4. Moreover, in this section, for $c \in \chi(E) \cup \{\chi(r, p(r))\}$, we use $p^{-1}(c)$ to denote the set of colors $c' \in \chi(E)$ such that $p(c') = c$, i.e. the colors of the "children" of $c$. For $m, n \in \mathbb{N}$, and $A \in \mathbb{R}^{m \times n}$, we use the notation $A[i]$ to refer to the $i$th row of $A$ and $A[i,j]$ to refer to the $j$th element in the $i$th row.

5.1 The construction

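Fix $\delta, \varepsilon \in (0, \tfrac{1}{2}]$, and let

$$t = \left\lceil \varepsilon^{-1} + \log \left\lceil \log_2 1/\delta \right\rceil \right\rceil, \tag{48}$$

and

$$m = \left\lceil t^2 \left( M(\chi) + \log_2 |E| \right) \right\rceil. \tag{49}$$

(See Lemma 5.15 for the relation between $\varepsilon$ and $\delta$, and the parameters of Theorem 3.1.) For $i \in \mathbb{Z}$, we first define the map $\Delta_i : V \to \mathbb{R}^{m \times t}$, and then we use it to construct our final embedding. For a vertex $v \in V$ and $c = \chi(v, p(v))$, let $\alpha = \sum_{c' \in \chi(E(P_v))} t^2 \tau_i(v_{c'})$, and

$$\beta = \alpha + \min\left( t^2 \tau_i(v),\ \left\lfloor \frac{d_T(v_c, v) - \sum_{\ell=-\infty}^{i-1} 2^\ell \tau_\ell(v)}{2^i / t^2} \right\rfloor \right).$$

Note that $\beta \le m$ since, by part (B) of (34), (33) and (49),

$$\alpha + t^2 \tau_i(v) \le t^2 \varphi(c) \le t^2 \left( M(\chi) + \log_2 |E| \right) \le m.$$

For $j \in [m]$, we define,

$$\Delta_i(v)[j] = \begin{cases}
\left( \frac{2^i}{t^2}, 0, 0, \ldots, 0 \right) & \text{if } \alpha < j \le \beta, \\[4pt]
\left( d_T(v_c, v) - \left( \left( \sum_{\ell=-\infty}^{i-1} 2^\ell \tau_\ell(v) \right) + (\beta - \alpha) \frac{2^i}{t^2} \right), 0, 0, \ldots, 0 \right) & \text{if } j = \beta + 1 \text{ and } \beta - \alpha < t^2 \tau_i(v), \\[4pt]
(0, 0, \ldots, 0) & \text{otherwise.}
\end{cases} \tag{50}$$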



Observe that the scale selector $\tau_i$ chooses the scales in this definition, and for $v \in V$ and $i \in \mathbb{Z}$, $\Delta_i(v) = 0$ when $\tau_i(v) = 0$. Also note that the second case in the definition only occurs when $\tau_i(v)$ is specified by part (A) of (34), and in that case $\sum_{\ell \le i} 2^\ell \tau_\ell(v) \ge d_T(v, v_c)$.

Now, we present some key properties of the map $\Delta_i(v)$. The following two observations follow immediately from the definitions.

Observation 5.1. For $v \in V$ and $i \in \mathbb{Z}$, each row in $\Delta_i(v)$ has at most one non-zero coordinate.

Observation 5.2. For $v \in V$ and $i \in \mathbb{Z}$, let $\alpha = \sum_{c' \in \chi(E(P_v))} t^2 \tau_i(v_{c'})$. For $j \notin (\alpha, \alpha + t^2 \tau_i(v)]$, we have

$$\Delta_i(v)[j] = (0, \ldots, 0).$$

Proofs of the next four lemmas will be presented in Section 5.2.

Lemma 5.3. For $v \in V$, there is at most one $i \in \mathbb{Z}$ and at most one couple $(j,k) \in [m] \times [t]$ such that $\Delta_i(v)[j,k] \notin \{0, \tfrac{2^i}{t^2}\}$.

Lemma 5.4. Let $c \in \chi(E)$, and $u, w \in V(\gamma_c) \setminus \{v_c\}$ be such that $d_T(w, v_c) \le d_T(u, v_c)$. For all $i \in \mathbb{Z}$ and $(j,k) \in [m] \times [t]$, we have

$$\Delta_i(w)[j,k] \le \Delta_i(u)[j,k].$$

Lemma 5.5. For $c \in \chi(E)$, and $u, w \in V(\gamma_c) \setminus \{v_c\}$, we have

$$d_T(w, u) = \sum_{i \in \mathbb{Z}} \|\Delta_i(u) - \Delta_i(w)\|_1, \tag{51}$$

and

$$d_T(v_c, u) = \sum_{i \in \mathbb{Z}} \|\Delta_i(u)\|_1. \tag{52}$$



Lemma 5.6. Fore G x(E), u,w G ^(7c)\{^c}, i > j and A; G [m], if both ||Ai («)[&] — Aj(u;)[A;]||i / 
0, and ||Aj(u)[fc] - Aj(w)[fc]||i / 0, f/ien d T (u,w) > . 

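Re-randomization. For $t \in \mathbb{N}$, let $\pi_t : \mathbb{R}^t \to \mathbb{R}^t$ be a random mapping obtained by uniformly permuting the coordinates in $\mathbb{R}^t$. Let $\{\sigma_j\}_{j \in [m]}$ be a sequence of i.i.d. random variables with the same distribution as $\pi_t$. We define the random variable $\pi_{t,m} : \mathbb{R}^{m \times t} \to \mathbb{R}^{m \times t}$ as follows,

$$\pi_{t,m} \begin{pmatrix} r_1 \\ \vdots \\ r_m \end{pmatrix} = \begin{pmatrix} \sigma_1(r_1) \\ \vdots \\ \sigma_m(r_m) \end{pmatrix}.$$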

The construction. We now use re-randomization to construct our final embedding. For $c \in \chi(E)$, and $i \in \mathbb{Z}$, the map $f_{i,c} : V(T(c)) \to \mathbb{R}^{m \times t}$ will represent an embedding of the subtree $T(c)$ at scale $2^i / t^2$. Recall that,

$$V(T(c)) = V(\gamma_c) \cup \left( \bigcup_{c' \in p^{-1}(c)} V(T(c')) \setminus \{v_{c'}\} \right).$$

Let $\{\Pi_{i,c'} : i \in \mathbb{Z}, c' \in p^{-1}(c)\}$ be a sequence of i.i.d. random variables which each have the distribution of $\pi_{t,m}$. We define $f_{i,c} : V(T(c)) \to \mathbb{R}^{m \times t}$ as follows,

$$f_{i,c}(x) = \begin{cases}
0 & \text{if } x = v_c, \\
\Delta_i(x) & \text{if } x \in V(\gamma_c) \setminus \{v_c\}, \\
\Delta_i(v_{c'}) + \Pi_{i,c'}(f_{i,c'}(x)) & \text{if } x \in V(T(c')) \setminus \{v_{c'}\} \text{ for some } c' \in p^{-1}(c).
\end{cases} \tag{53}$$

Re-randomization permutes the elements within each row, and the permutations are independent for different subtrees, scales, and rows. Finally, we define $f_i = f_{i,c_0}$, where $c_0 = \chi(r, p(r))$. We use the following lemma to prove Theorem 3.1.

Lemma 5.7. There exists a universal constant $C$ such that the following holds with non-zero probability: For all $x, y \in V$,

$$(1 - C\varepsilon)\, d_T(x,y) - \delta \rho_\chi(x,y;\delta) \le \sum_{i \in \mathbb{Z}} \|f_i(x) - f_i(y)\|_1 \le d_T(x,y). \tag{54}$$



We will prove Lemma 5.7 in Section 5.3. We first make two observations, and then use them to prove Theorem 3.1. Our first observation is immediate from Observation 5.1 and Observation 5.2, since in the third case of (53), by Observation 5.2, $\Delta_i(v_{c'})$ and $\Pi_{i,c'}(f_{i,c'}(x))$ must be supported on disjoint sets of rows.

Observation 5.8. For any $v \in V$ and for any row $j \in [m]$, there is at most one non-zero coordinate in $f_i(v)[j]$.

Observation 5.2 and Lemma 5.5 also imply the following.

Observation 5.9. For any $v \in V$ and $u \in P_v$, we have $d_T(u,v) = \sum_{i \in \mathbb{Z}} \|f_i(u) - f_i(v)\|_1$.

Using these, together with Corollary 3.5, we now prove Theorem 3.1.



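Proof of Theorem 3.1. By Lemma 5.7, there exists a choice of mappings $\{g_i\}_{i \in \mathbb{Z}}$ such that for all $x, y \in V$,

$$d_T(x,y) \ge \sum_{i \in \mathbb{Z}} \|g_i(x) - g_i(y)\|_1 \ge (1 - O(\varepsilon))\, d_T(x,y) - \delta \rho_\chi(x,y;\delta).$$

We will apply Corollary 3.5 to the family given by $\left\{ f_i = \frac{t^2 g_i}{2^i} \right\}_{i \in \mathbb{Z}}$ to arrive at an embedding $F : V \to \ell_1^{mt(2 + \lceil \log 1/\varepsilon \rceil)}$ such that $G = F/t^2$ satisfies,

$$d_T(x,y) \ge \|G(x) - G(y)\|_1 \ge (1 - O(\varepsilon))\, d_T(x,y) - \delta \rho_\chi(x,y;\delta). \tag{55}$$

Observe that the codomain of $f_i$ is $\mathbb{R}^{m \times t}$, where $mt = \Theta\big( (\tfrac{1}{\varepsilon} + \log\log(\tfrac{1}{\delta}))^3 \log n \big)$, and the codomain of $G$ is $\mathbb{R}^d$, where $d = \Theta\big( \log\tfrac{1}{\varepsilon}\, (\tfrac{1}{\varepsilon} + \log\log(\tfrac{1}{\delta}))^3 \log n \big)$.

To achieve (55), we need only show that for every $x, y \in V$, we have $\zeta(x,y) \lesssim \varepsilon\, d_T(x,y)$, where $\zeta(x,y)$ is defined in (30). Recalling this definition, we now restate $\zeta$ in terms of our explicit family $\left\{ f_i = \frac{t^2 g_i}{2^i} \right\}_{i \in \mathbb{Z}}$. We have,

$$\zeta(x,y) = \sum_{i \in \mathbb{Z}} \ \sum_{\substack{(k_1,k_2) \in [m] \times [t] \,:\, \exists j < i \\ g_j(x)[k_1,k_2] \ne g_j(y)[k_1,k_2]}} h_i(x,y;k_1,k_2), \tag{56}$$

where,

$$h_i(x,y;k_1,k_2) = \frac{2^i}{t^2} \left( \left| \frac{t^2}{2^i} g_i(x)[k_1,k_2] - \frac{t^2}{2^i} g_i(y)[k_1,k_2] \right| - \left\lfloor \left| \frac{t^2}{2^i} g_i(x)[k_1,k_2] - \frac{t^2}{2^i} g_i(y)[k_1,k_2] \right| \right\rfloor \right).$$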


Fix $x, y \in V$. For $c \in \chi(E(P_{xy}))$, let $\lambda_c$ be the induced subgraph on $V(P_{xy}) \cap V(\gamma_c)$, i.e. the subpath of $P_{xy}$ where all edges are colored by color $c$. We have,

$$d_T(x,y) = \sum_{c \in \chi(E(P_{xy}))} \mathrm{len}(E(\lambda_c)). \tag{57}$$

If we look at a single term in (56), we have

$$h_i(x,y;k_1,k_2) \le \frac{2^i}{t^2}. \tag{58}$$



For $u, v \in P_{xy}$, let

$$S_i(u,v) = \{ (k_1,k_2) \in [m] \times [t] : h_i(u,v;k_1,k_2) \ne 0 \text{ and } \exists j < i : g_j(x)[k_1,k_2] \ne g_j(y)[k_1,k_2] \}.$$

Now, notice that if $\frac{t^2}{2^i}(g_i(x)[k_1,k_2] - g_i(y)[k_1,k_2])$ is fractional, then there must exist a subpath $\lambda_c$, for a color $c \in \chi(E(P_{xy}))$, with endpoints $u_c$ and $v_c$ such that $\frac{t^2}{2^i}(g_i(u_c)[k_1,k_2] - g_i(v_c)[k_1,k_2])$ is fractional too. Hence we have

$$\zeta(x,y) \le \sum_{c \in \chi(E(P_{xy}))} \sum_{i \in \mathbb{Z}} \frac{2^i |S_i(u_c, v_c)|}{t^2}.$$

We call $\sum_{i \in \mathbb{Z}} \frac{2^i |S_i(u_c, v_c)|}{t^2}$ the contribution of $\lambda_c$, for each color $c \in \chi(E(P_{xy}))$.



We divide the analysis of the paths $\lambda_c$ for $c \in \chi(E(P_{xy}))$ into two cases. For $c \in \chi(E(P_x)) \,\triangle\, \chi(E(P_y))$, the vertex $v_c$ is one endpoint of the path $\lambda_c$. Let $u_c$ be the other. By Lemma 5.3, there is at most one $i \in \mathbb{Z}$ and $(k_1,k_2) \in [m] \times [t]$ such that $h_i(u_c, v_c; k_1, k_2) \ne 0$, and

$$\left| \bigcup_{i \in \mathbb{Z}} S_i(u_c, v_c) \right| \le 1.$$

By Lemma 4.3, for all $i \in \mathbb{Z}$ with $d_T(u_c, v_c) < 2^{i-1}$, we have $\tau_i(u_c) = 0$, and

$$\|\Delta_i(u_c)\|_1 = \|g_i(u_c) - g_i(v_c)\|_1 = 0. \tag{59}$$

For $i \le 1 + \log_2(d_T(u_c, v_c))$, by (58) and Lemma 5.3 we can bound the contribution of $\lambda_c$ to $\zeta(x,y)$ by,

$$\sum_{i \in \mathbb{Z}} \frac{2^i |S_i(u_c, v_c)|}{t^2} \le \frac{2\, d_T(u_c, v_c)}{t^2} \le \varepsilon\, d_T(u_c, v_c). \tag{60}$$



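In the case that $c \notin \chi(E(P_x)) \,\triangle\, \chi(E(P_y))$, note that there is at most one color in $\chi(E(P_{xy})) \setminus \big(\chi(E(P_x)) \,\triangle\, \chi(E(P_y))\big)$. If no such color exists, then by (60),

$$\zeta(x,y) \le \sum_{c \in \chi(E(P_{xy}))} \varepsilon\,\mathrm{len}(E(\lambda_c)) = \varepsilon\, d_T(x,y).$$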

Suppose now that $\{c\} = \chi(E(P_{xy})) \setminus \big(\chi(E(P_x)) \,\triangle\, \chi(E(P_y))\big)$. Let $u, w \in V(\lambda_c)$ be the closest vertices to $x$ and $y$, respectively. For $i \in \mathbb{Z}$ we will show that if $h_i(u,w;k_1,k_2) \neq 0$, then either $d_T(x,y) \ge 2^{i-2}$, or for all $j < i$, we have $(g_j(x) - g_j(y))[k_1,k_2] = 0$. Then, by Observation 5.3 there are at most two elements in $g_i(u) - g_i(w)$ that are not in $\{0, \pm 2^i/t^2\}$, therefore we can conclude



C(x,y) < 2^ p 



' t 2 

+ E E 

c6 X (S(F !C ))Ax(S(P s )) iSZ 



t 2 



4ed T (x,y)+ ^ elen(A c ) 

cex(s(P«))A X (JS(P»)) 

< 5ed T (x,y). 

Without loss of generality suppose that $d_T(u,v_c) \le d_T(w,v_c)$. If $d_T(w,v_c) = 0$ then the contribution of $\lambda_c$ to $\zeta(x,y)$ is zero. Suppose now that $d_T(w,v_c) > 0$, and let $m_w = \max\{i : \tau_i(w) \neq 0\}$. By Lemma 4.3 the maximum always exists.

We will now split the rest of the proof into two cases.

Case 1: $\tau_{m_w-1}(u) = 0$.



In this case by Lemma 4.6 we have $d_T(u,w) \ge 2^{m_w-1}$. For $(k_1,k_2) \in [m]\times[t]$, if $h_i(u,w;k_1,k_2) \neq 0$ then by (50), $i \le m_w$ and

$$\frac{2^i}{t^2} \le \frac{2^{m_w}}{t^2} \le \frac{2\,d_T(u,w)}{t^2} \le \frac{2\,d_T(x,y)}{t^2} \le \varepsilon\, d_T(x,y).$$



Case 2: $\tau_{m_w-1}(u) \neq 0$.

Let $m_u = \max\{i : \tau_i(u) \neq 0\}$. By Lemma 4.5 and as $\tau_{m_w-1}(u) \neq 0$, we have $m_u \le m_w \le m_u + 1$. Observation 4.2 implies that for all $j < m_u$,

T j( u ) + Tj(v c >) = (p(c). 

We have m w > m u , and by Observation |4,2[ 

T j{w)+ ^ Tjivd) = Tj (u)+ ^2 Tj(v c/ ) = Lp(c). 

c'e x (E(P w )) c>e x (E(Pu)) 
therefore, by Observation 5.2 for j < m u and k G [i 2 9?(c)] 

\\( 9j (x) - <&(u))[fc]||i = ||fe(y) -5»)[&]||i = 0, 



(61) 



(62) 



and by Observation 5.2 and part (B) of ( |34[ ), for all i € Z, all the non-zero elements of <7i(zi) —gi(w) 
are in the first t 2 ip{c) rows. 

Suppose that there exists $k \in [m]$ such that $\|(g_i(u) - g_i(w))[k]\|_1 \neq 0$. Now, we divide the proof into two cases again.

Case 2.1: There exists a $j < i$ such that $\|(g_j(x) - g_j(u))[k]\|_1 + \|(g_j(y) - g_j(w))[k]\|_1 \neq 0$.

In this case, there must exist some $c' \in \chi(E(P_x)) \,\triangle\, \chi(E(P_y))$ such that

$$\|(g_j(u_{c'}) - g_j(v_{c'}))[k]\|_1 \neq 0.$$

By (53) and (50), we have $\tau_j(u_{c'}) \neq 0$. Inequality (62) implies $j \ge m_u$, and finally by Lemma 4.3

$$d_T(x,y) \ge d_T(u_{c'},v_{c'}) > 2^{j-1} \ge 2^{m_u-1} \ge 2^{m_w-2} \ge 2^{i-2}. \qquad (63)$$

Case 2.2: $\|(g_j(x) - g_j(u))[k]\|_1 + \|(g_j(y) - g_j(w))[k]\|_1 = 0$ for all $j < i$.

In this case, either for all $j < i$, $\|g_j(x)[k] - g_j(y)[k]\|_1 = 0$, which implies that for $k' \in [t]$, $(k,k') \notin S_i(u,w)$, or $\|g_j(u)[k] - g_j(w)[k]\|_1 \neq 0$ for some $j < i$. If $\|g_j(u)[k] - g_j(w)[k]\|_1 \neq 0$ for some $j < i$ then by Lemma 5.6,

$$d_T(x,y) \ge d_T(u,w) \ge 2^{m_u-1} \ge 2^{m_w-2} \ge 2^{i-2}. \qquad (64)$$

For $i > m_w$ we have $\|g_i(u) - g_i(w)\|_1 = 0$, therefore in both cases if $h_i(x,y;k_1,k_2) \neq 0$, either for all $j < i$, $\|g_j(x)[k] - g_j(y)[k]\|_1 = 0$, or

$$\frac{2^i}{t^2} \le \frac{4\,d_T(x,y)}{t^2} \le 2\varepsilon\, d_T(x,y).$$

□



□ 
5.2 Properties of the $\Delta_i$ maps

We now present proofs of Lemmas 5.3-5.6.



Proof of Lemma 5.3. For a fixed $i \in \mathbb{Z}$, by (50) there is at most one element in $\Delta_i(v)$ that takes a value outside $\{0, 2^i/t^2\}$.

We prove this lemma by showing that if for some $i \in \mathbb{Z}$ and $(j,k) \in [m]\times[t]$,

$$0 < \Delta_i(v)[j,k] < \frac{2^i}{t^2},$$

then for all $i' > i$ and $(j',k') \in [m]\times[t]$, we have $\Delta_{i'}(v)[j',k'] = 0$. Let $c = \chi(v,p(v))$. Using (50), we can conclude that



Since the left hand side is an integer, 

t 2 Ti( V ) > 

and 



dT(Vc,v)-U=-oo2 e Tl(v) 



2*/t 2 



£<i Ki 

- V 21 

> d T (v c ,v). 



Ki 



By part (A) of (34), for $i' > i$ we have $\tau_{i'}(v) = 0$, thus $\|\Delta_{i'}(v)\|_1 = 0$ and the proof is complete. □



Proof of Lemma 5-4- For i < 



, / m(T) 
i0g 2 \M(x)+\o S2 \E\ 



we have ||Afc(u) 



AfcM 



o. 



Let $\nu$ be the minimum integer such that part (A) of (34) for $\tau_\nu(w)$ is less than or equal to part (B). This $\nu$ exists since, by (35), part (B) of (34) is always positive, while by Lemma 4.3, part (A) of (34) must be zero for some $\nu \in \mathbb{Z}$. First we analyze the case when $i < \nu$.

Observation 4.2 implies that part (B) of (34) specifies the value of $\tau_i(w)$. By Lemma 4.5, $\tau_i(u) \ge \tau_i(w)$, but part (B) for $\tau_i(u)$ is the same as for $\tau_i(w)$, so we must have $\tau_i(u) = \tau_i(w)$, and the same reasoning holds for $\tau_\ell(w)$ for $\ell < i$. Using this and the fact that part (A) does not define $\tau_i(w)$, we have



2V i H + ^2^H 

Ki 



2Vi(«) + 2 M^) < d T (v c , w) < d T ( 



v c ,u) 



Ki 



Therefore, the second case in (50) happens neither for $u$ nor for $w$, and for $i < \nu$ we have $\Delta_i(u) = \Delta_i(w)$.

We now consider the case $i = \nu$. We have already shown that for $\ell < i$, $\tau_\ell(u) = \tau_\ell(w)$, and using (50), it is easy to verify that for all $(j,k) \in [m]\times[t]$,

$$\Delta_i(u)[j,k] \ge \Delta_i(w)[j,k].$$

Finally, in the case that $i > \nu$, by Observation 4.2 we have $\tau_i(w) = 0$, and $\Delta_i(w)[j,k] = 0$. □
Proof of Lemma 5.5. For all $i \in \mathbb{Z}$, recalling the definition of $\alpha$ and $\beta$ in (50) for $\Delta_i(u)$, we have

Mvc,v) - Ej=-oo ^(«) 



2Vi 2 



/3 — a = min ^i 2 Tj(u) 
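and by definition of $\Delta_i(u)$ we have,

$$\|\Delta_i(u)\|_1 = \min\Big(2^i\tau_i(u),\; d_T(u,v_c) - \sum_{j<i} 2^j\tau_j(u)\Big).$$

By Lemma 4.4 we have $\sum_{i\in\mathbb{Z}} 2^i\tau_i(u) \ge d_T(u,v_c)$, therefore $d_T(v_c,u) = \sum_{i\in\mathbb{Z}} \|\Delta_i(u)\|_1$. The same argument also implies that $d_T(w,v_c) = \sum_{i\in\mathbb{Z}} \|\Delta_i(w)\|_1$.

Now, suppose that $d_T(u,v_c) \ge d_T(w,v_c)$. Then Lemma 5.4 implies that,

$$\sum_{i\in\mathbb{Z}} \|\Delta_i(u) - \Delta_i(w)\|_1 = \sum_{i\in\mathbb{Z}} \big(\|\Delta_i(u)\|_1 - \|\Delta_i(w)\|_1\big) = d_T(v_c,u) - d_T(v_c,w) = d_T(w,u).$$

□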
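The telescoping step in the proof of Lemma 5.5 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes hypothetical values $\tau_i$ as input, lets each scale $i$ absorb $\min(2^i\tau_i, \text{remaining distance})$ in increasing order of $i$, and treats the remaining distance as clamped at zero (an assumption made here; the precise definition is (50)). The names `scale_contributions`, `d` and `tau` are not from the paper.

```python
def scale_contributions(d, tau):
    """Greedy per-scale contributions (illustrative stand-in for ||Delta_i(u)||_1).

    d   : distance d_T(u, v_c), assumed non-negative
    tau : dict mapping scale i -> non-negative value tau_i (hypothetical input)
    Scale i absorbs min(2^i * tau_i, distance not yet absorbed by scales j < i).
    """
    remaining = d
    contributions = {}
    for i in sorted(tau):
        c = min((2.0 ** i) * tau[i], remaining)
        contributions[i] = c
        remaining -= c
    return contributions


# Capacities 2^i * tau_i are 1.5, 2, 2, 20; their sum exceeds d = 4.25,
# so the contributions add up to exactly d.
example = scale_contributions(4.25, {-1: 3, 0: 2, 1: 1, 2: 5})
total = sum(example.values())
```

When the total capacity $\sum_i 2^i\tau_i$ is at least $d$, the contributions sum to $d$, which mirrors the identity $d_T(v_c,u) = \sum_i \|\Delta_i(u)\|_1$.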



Proof of Lemma 5.6. Without loss of generality suppose that $d_T(v_c,u) \ge d_T(v_c,w)$. We have,

$$d_T(u,w) = \sum_{h\in\mathbb{Z}} \|\Delta_h(u) - \Delta_h(w)\|_1 \ge \sum_{h=j}^{i} \|\Delta_h(u) - \Delta_h(w)\|_1 \ge \|\Delta_i(u) - \Delta_i(w)\|_1 + \|\Delta_j(u) - \Delta_j(w)\|_1. \qquad (65)$$



By Lemma 4.5 we have Tj(w) < Tj(u). If part (B) of (34) is less than part (A), then by (50), for 
all h such that 

t 2 Tj{v c >) <h< t 2 (p(c), 



we have ||Aj(u; 



1 ~ i 5 



And, by Lemma |5. 4 



and Observation 

Hence, part (A) of (34) must specify the value of Tj[w). Observation 4.2 implies that Ti(w) = 



5.2 



for k G Z, Aj(w) = Aj(i 



and by (|50|), we have ||Aj(u>)||i = 0. 



By (50), since ||A» («)[&] ||i > 0, and a from (50) is a multiple of r, for all i |JrJ < h < k we 



have 1 1 A,- (it 



1 - 7^ 



. This implies that, 



\Ai(u) - Ai(w)\\i > ^ I /. - I -/- 



A: 






A; 






^ - 1 - i 2 











Moreover, ||Aj(u>)[£;]||i < fj-, and ([50]) implies that for all A; < /i < t 2 [l + |jj , we have ||Aj(u;)[/i]||i 



0. The same argument also shows that, 



|A i (u)-A i (t t ;)||i > 





k 







k). 



Hence by (65), 



dr{u, w) > 



t 2 -l , 



t 2 



2 j > 2 i_1 . 



□ 
5.3 The probabilistic analysis

We are thus left to prove Lemma 5.7. For $c \in \chi(E)$, we analyze the embedding for $T(c)$ by going through all $c' \in \chi(E(T(c)))$ one by one in increasing order of $\varphi(c')$. Our first lemma bounds the probability of a bad event, i.e. of a subpath not contributing enough to the distance in the embedding.

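Lemma 5.10. For any $C \ge 8$, the following holds. Consider three colors $a \in \chi(E)$, $b \in \rho^{-1}(a)$, and $c \in \chi(E(P_{u v_b}))$ for some $u \in V(T(b))$. Then for every $w \in V(T(a)) \setminus V(T(b))$, we have

$$\mathbb{P}\Big[\exists x \in V(P_{w v_a}) : \sum_{i\in\mathbb{Z}} \|f_{i,a}(x) - f_{i,a}(u)\|_1 \le (1 - C\varepsilon)\, d_T(u,v_c) + \sum_{i\in\mathbb{Z}} \|f_{i,a}(x) - f_{i,a}(v_c)\|_1 \Big] \le \frac{1}{\lceil \log_2 1/\delta \rceil} \exp\Big(-\frac{C}{\varepsilon\, 2^{\beta+2}}\, d_T(u,v_c)\Big), \qquad (66)$$

where $\beta = \max\{i : \exists y \in P_{u v_c} \setminus \{v_c\},\ \tau_i(y) \neq 0\}$. (See Figure 1 for the position of the vertices in the tree.)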



Figure 1: Position of vertices corresponding to the statement of Lemma 5.10.



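Proof. Recall that $\mathbb{R}^{m \times t}$ is the codomain of $f_{i,a}$. For $i \in \mathbb{Z}$, $j \in [m]$, and $z \in V(P_{w v_a})$, let

$$s_{ij}(z) = \|f_{i,a}(z)[j] - f_{i,a}(v_c)[j]\|_1 + \|f_{i,a}(v_c)[j] - f_{i,a}(u)[j]\|_1 - \|f_{i,a}(z)[j] - f_{i,a}(u)[j]\|_1.$$

We have,

$$\sum_{i\in\mathbb{Z}} \|f_{i,a}(u) - f_{i,a}(v_c)\|_1 + \sum_{i\in\mathbb{Z}} \|f_{i,a}(v_c) - f_{i,a}(z)\|_1 = \sum_{i\in\mathbb{Z}} \|f_{i,a}(z) - f_{i,a}(u)\|_1 + \sum_{i\in\mathbb{Z},\, j\in[m]} s_{ij}(z).$$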
iez iez iez iez,je[m] 
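By Observation 5.9, we have $d_T(u,v_c) = \sum_{i\in\mathbb{Z}} \|f_{i,a}(u) - f_{i,a}(v_c)\|_1$, therefore

$$d_T(u,v_c) - \sum_{i\in\mathbb{Z},\, j\in[m]} s_{ij}(z) = \sum_{i\in\mathbb{Z}} \|f_{i,a}(z) - f_{i,a}(u)\|_1 - \sum_{i\in\mathbb{Z}} \|f_{i,a}(z) - f_{i,a}(v_c)\|_1. \qquad (67)$$

Let $\mathcal{E} = \{f_{i,c'} : c' \in \rho^{-1}(a)\}$. We define $\mathbb{P}_{\mathcal{E}}[\cdot] = \mathbb{P}[\,\cdot \mid \mathcal{E}]$. In order to prove this lemma, we bound

$$\mathbb{P}_{\mathcal{E}}\Big[\exists x \in V(P_{w v_a}) : \sum_{i\in\mathbb{Z},\, j\in[m]} s_{ij}(x) \ge C\varepsilon\, d_T(u,v_c)\Big].$$

We start by bounding the maximum of the random variables $s_{ij}$.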

For $i > \beta$ we have $\Delta_i(u) = \Delta_i(v_c)$, hence $f_{i,a}(u) = f_{i,a}(v_c)$. Using the triangle inequality, for all $i \in \mathbb{Z}$, $j \in [m]$ and $z \in P_{w v_a}$,

$$s_{ij}(z) \le 2\,\|f_{i,a}(v_c)[j] - f_{i,a}(u)[j]\|_1. \qquad (68)$$

Hence for all $i \in \mathbb{Z}$ and $j \in [m]$, by Observation 5.8,

$$s_{ij}(z) \le 2\,\|f_{i,a}(v_c)[j] - f_{i,a}(u)[j]\|_1 \le \frac{2^{\beta+1}}{t^2}. \qquad (69)$$



First note that, if z is on the path between Vb and v a then by Observation 5.9, Sij(z) = 
Observation 



5.2 



and @ imply that if ||/i, a («)|>1 - /i, a (Ob']||l / then ||/i, a (<)[j]||i = 0. 
From this, we can conclude that $s_{ij}(z) \neq 0$ if and only if there exists a $k \in [t]$ such that both $f_{i,a}(u)[j,k] - f_{i,a}(v_c)[j,k] \neq 0$ and $f_{i,a}(z)[j,k] \neq 0$. Since by Lemma 5.4, for all $i \in \mathbb{Z}$, $j \in [m]$ and $k \in [t]$, we have $f_{i,a}(w)[j,k] \ge f_{i,a}(z)[j,k]$, we conclude that for $z \in P_{w v_a}$, if $s_{ij}(z) \neq 0$ then $s_{ij}(w) \neq 0$.

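Now, for $i \in \mathbb{Z}$ and $j \in [m]$, we define a random variable

$$X_{ij} = \begin{cases} 0 & \text{if } s_{ij}(w) = 0, \\ 2\,\|f_{i,a}(u)[j] - f_{i,a}(v_c)[j]\|_1 & \text{if } s_{ij}(w) \neq 0. \end{cases} \qquad (70)$$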



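Note that since the re-randomization in (53) is performed independently on each row and at each scale, the random variables $\{X_{ij} : i \in \mathbb{Z}, j \in [m]\}$ are mutually independent. By (68), for all $z \in P_{w v_a}$ we have $s_{ij}(z) \le X_{ij}$, and thus

$$\mathbb{P}_{\mathcal{E}}\Big[\exists x \in V(P_{w v_a}) : \sum_{i\in\mathbb{Z},\, j\in[m]} s_{ij}(x) \ge C\varepsilon\, d_T(u,v_c)\Big] \le \mathbb{P}_{\mathcal{E}}\Big[\sum_{i\in\mathbb{Z},\, j\in[m]} X_{ij} \ge C\varepsilon\, d_T(u,v_c)\Big]. \qquad (71)$$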



As before, for $X_{ij}$ to be non-zero, it must be that some $k \in [t]$ is such that $f_{i,a}(w)[j,k] \neq 0$ and $f_{i,a}(u)[j,k] - f_{i,a}(v_c)[j,k] \neq 0$. Since $w \notin V(T(b))$, with the re-randomization in (53) and Observation 5.8 this happens with probability at most $\frac{1}{t}$, hence for $j \in [m]$ and $i \in \mathbb{Z}$,

$$\mathbb{P}_{\mathcal{E}}\Big[\|f_{i,a}(w)[j] - f_{i,a}(v_c)[j]\|_1 + \|f_{i,a}(v_c)[j] - f_{i,a}(u)[j]\|_1 - \|f_{i,a}(w)[j] - f_{i,a}(u)[j]\|_1 \neq 0\Big] \le \frac{1}{t}.$$
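This yields,

$$\mathbb{E}[X_{ij} \mid \mathcal{E}] \le \frac{1}{t}\,\big(2\,\|f_{i,a}(u)[j] - f_{i,a}(v_c)[j]\|_1\big). \qquad (72)$$

Now we use (69) to write

$$\mathrm{Var}(X_{ij} \mid \mathcal{E}) \le \frac{1}{t}\,\big(2\,\|f_{i,a}(u)[j] - f_{i,a}(v_c)[j]\|_1\big)^2 \le \frac{2^{\beta+2}}{t^3}\,\|f_{i,a}(u)[j] - f_{i,a}(v_c)[j]\|_1,$$

and use Observation 5.9 in conjunction with (72) to conclude that

$$\mathbb{E}\Big[\sum_{i\in\mathbb{Z},\, j\in[m]} X_{ij} \,\Big|\, \mathcal{E}\Big] \le \sum_{i\in\mathbb{Z},\, j\in[m]} \frac{2}{t}\,\|f_{i,a}(v_c)[j] - f_{i,a}(u)[j]\|_1 = \frac{2}{t}\, d_T(v_c,u), \qquad (73)$$

and

$$\sum_{i\in\mathbb{Z},\, j\in[m]} \mathrm{Var}(X_{ij} \mid \mathcal{E}) \le \sum_{i\in\mathbb{Z},\, j\in[m]} \frac{2^{\beta+2}}{t^3}\,\|f_{i,a}(v_c)[j] - f_{i,a}(u)[j]\|_1 = \frac{2^{\beta+2}}{t^3}\, d_T(v_c,u). \qquad (74)$$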

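Define $M = \max\{X_{ij} - \mathbb{E}[X_{ij} \mid \mathcal{E}] : i \in \mathbb{Z}, j \in [m]\}$. We now apply Theorem 2.2 to complete the proof:

$$\mathbb{P}_{\mathcal{E}}\Big[\sum_{i,j} X_{ij} \ge \frac{C\, d_T(u,v_c)}{t}\Big] \le \mathbb{P}_{\mathcal{E}}\Big[\sum_{i,j} X_{ij} - \mathbb{E}\Big[\sum_{i,j} X_{ij} \,\Big|\, \mathcal{E}\Big] \ge \frac{(C-2)\, d_T(u,v_c)}{t}\Big] \le \exp\left(\frac{-\big((C-2)\, d_T(u,v_c)/t\big)^2}{2\Big(\sum_{i,j} \mathrm{Var}(X_{ij} \mid \mathcal{E}) + (C-2)\big(d_T(u,v_c)/t\big) M/3\Big)}\right).$$

Since $\mathbb{E}[X_{ij} \mid \mathcal{E}] \ge 0$, (69) implies $M \le 2^{\beta+1}/t^2$. Now, we can plug in this bound and (74) to write,

$$\mathbb{P}_{\mathcal{E}}\Big[\sum_{i,j} X_{ij} \ge \frac{C\, d_T(u,v_c)}{t}\Big] \le \exp\left(\frac{-\big((C-2)\, d_T(u,v_c)/t\big)^2}{2\Big(\frac{2^{\beta+2}}{t^3}\, d_T(u,v_c) + (C-2)\big(d_T(u,v_c)/t\big)\big(2^{\beta+1}/t^2\big)/3\Big)}\right)$$

$$= \exp\left(\frac{-t\,(C-2)^2\, d_T(u,v_c)}{2\big(2^{\beta+2} + (C-2)\,2^{\beta+1}/3\big)}\right) = \exp\left(-\frac{(C-2)^2}{(C-2)/3 + 2} \cdot \frac{t\, d_T(u,v_c)}{2^{\beta+2}}\right).$$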
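An elementary calculation shows that for $C \ge 8$, $\frac{(C-2)^2}{(C-2)/3 + 2} \ge C$, hence

$$\mathbb{P}_{\mathcal{E}}\Big[\sum_{i,j} X_{ij} \ge \frac{C\, d_T(u,v_c)}{t}\Big] \le \exp\Big(-\frac{C\,t}{2^{\beta+2}}\, d_T(u,v_c)\Big) = \exp\Big(-C\Big(\frac{1}{\varepsilon} + \log\lceil \log_2 (1/\delta) \rceil\Big) \frac{d_T(u,v_c)}{2^{\beta+2}}\Big) = \Big(\frac{1}{\lceil \log_2(1/\delta) \rceil}\Big)^{\frac{C\, d_T(u,v_c)}{2^{\beta+2}}} \exp\Big(-\frac{C}{\varepsilon\, 2^{\beta+2}}\, d_T(u,v_c)\Big).$$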



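Since there exists a $y \in P_{u v_c} \setminus \{v_c\}$ such that $\tau_\beta(y) \neq 0$, and for all $c' \in \chi(E)$, $\kappa(c') \ge 1$, Lemma 4.3 implies that $d_T(u,v_c) > 2^{\beta-1}$, and for $C \ge 8$, we have $\frac{C\, d_T(u,v_c)}{2^{\beta+2}} > 1$. Therefore,

$$\mathbb{P}_{\mathcal{E}}\Big[\exists x \in V(P_{w v_a}) : \sum_{i\in\mathbb{Z}} \|f_{i,a}(x) - f_{i,a}(u)\|_1 \le (1 - C\varepsilon)\, d_T(u,v_c) + \sum_{i\in\mathbb{Z}} \|f_{i,a}(x) - f_{i,a}(v_c)\|_1\Big]$$

$$= \mathbb{P}_{\mathcal{E}}\Big[\exists x \in V(P_{w v_a}) : \sum_{i,j} s_{ij}(x) \ge C\varepsilon\, d_T(u,v_c)\Big] \le \mathbb{P}_{\mathcal{E}}\Big[\sum_{i\in\mathbb{Z},\, j\in[m]} X_{ij} \ge C\varepsilon\, d_T(u,v_c)\Big] \le \frac{1}{\lceil \log_2(1/\delta) \rceil} \exp\Big(-\frac{C}{\varepsilon\, 2^{\beta+2}}\, d_T(u,v_c)\Big),$$

completing the proof.

□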
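The algebra behind the last steps of this proof can be double-checked with exact rational arithmetic. The following sketch is an illustration, not part of the original argument: it plugs the bounds $\sum \mathrm{Var} \le 2^{\beta+2} d/t^3$ and $M \le 2^{\beta+1}/t^2$ into a Bernstein-type exponent $\frac{s^2}{2(\sigma^2 + sM/3)}$ with $s = (C-2)d/t$, and compares the result with the simplified form. The function names are hypothetical, and `d` stands for $d_T(u,v_c)$.

```python
from fractions import Fraction


def bernstein_exponent(C, t, d, beta):
    """Exponent s^2 / (2 (sigma^2 + s M / 3)) with the bounds (69) and (74) plugged in."""
    s = Fraction(C - 2) * d / t                      # deviation (C - 2) d / t
    var_sum = Fraction(2) ** (beta + 2) * d / t ** 3  # bound on the sum of variances
    M = Fraction(2) ** (beta + 1) / t ** 2            # bound on the largest summand
    return s ** 2 / (2 * (var_sum + s * M / 3))


def coefficient(C):
    """The factor (C - 2)^2 / ((C - 2)/3 + 2) from the simplified exponent."""
    return Fraction((C - 2) ** 2) / (Fraction(C - 2, 3) + 2)


def simplified_exponent(C, t, d, beta):
    """coefficient(C) * t * d / 2^(beta + 2)."""
    return coefficient(C) * t * d / Fraction(2) ** (beta + 2)


# The two forms agree, and the coefficient is at least C once C >= 8.
same = bernstein_exponent(8, Fraction(10), Fraction(7, 2), 3) == simplified_exponent(
    8, Fraction(10), Fraction(7, 2), 3
)
threshold_ok = all(coefficient(C) >= C for C in range(8, 200))
```

Solving $(C-2)^2 \ge C\big((C-2)/3 + 2\big)$ gives $C^2 - 8C + 6 \ge 0$, i.e. $C \ge 4 + \sqrt{10} \approx 7.17$, which is why the integer threshold $C \ge 8$ suffices while $C = 7$ does not.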



The $\Gamma_a$ mappings. Before proving Lemma 5.7 we need some more definitions. For a color $a \in \chi(E)$, we define a map $\Gamma_a : V(T(a)) \to V(T(a))$ based on Lemma 5.10. For $u \in V(\gamma_a)$, we put $\Gamma_a(u) = u$. For all other vertices $u \in V(T(a)) \setminus V(\gamma_a)$, there exists a unique color $b \in \rho^{-1}(a)$ such that $u \in V(T(b))$. We define $\Gamma_a(u)$ as the vertex $w \in V(P_{u v_b})$ which is closest to the root among those vertices satisfying the following condition: For all $v \in V(P_{uw}) \setminus \{w\}$ and $k \in \mathbb{Z}$, $\tau_k(v) \neq 0$ implies

$$2^k \le \frac{d_T(u,w)}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)}. \qquad (75)$$

Clearly such a vertex exists, because the conditions are vacuously satisfied for $w = u$. We now prove some properties of the map $\Gamma_a$.
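To make the definition concrete, here is a small sketch that evaluates the rule on a single root-ward path. It assumes the condition reads $2^k \le d_T(u,w) / \big(\varepsilon(\varphi(\chi(u,p(u))) - \varphi(a))\big)$ for every scale $k$ that is active at a vertex strictly below $w$; the inputs (`dist_from_u`, `active_scales`, `gap`) are hypothetical stand-ins for $d_T(u,\cdot)$, the sets $\{k : \tau_k(v) \neq 0\}$, and $\varphi(\chi(u,p(u))) - \varphi(a)$.

```python
def gamma_index(dist_from_u, active_scales, eps, gap):
    """Index of the vertex closest to the root that satisfies the condition.

    dist_from_u[w]   : distance from u (index 0) to the w-th vertex on the path to v_b
    active_scales[v] : set of scales k with tau_k(v) != 0 (hypothetical input)
    eps, gap         : epsilon and phi(chi(u, p(u))) - phi(a), both assumed positive
    """
    best = 0  # w = u always qualifies: the condition is vacuous
    for w in range(len(dist_from_u)):
        threshold = dist_from_u[w] / (eps * gap)
        ok = all(
            2.0 ** k <= threshold
            for v in range(w)          # vertices of P_{uw} other than w
            for k in active_scales[v]
        )
        if ok:
            best = w                   # later indices are closer to the root
    return best


# Scale 3 at the third vertex blocks both later candidates (8 > 3 and 8 > 4).
idx = gamma_index([0, 1, 2, 3, 4], [{0}, set(), {3}, set(), set()], eps=1.0, gap=1.0)
```

Note that the condition need not be monotone along the path, so the sketch scans every candidate rather than stopping at the first failure.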

Lemma 5.11. Consider any $a \in \chi(E)$ and $u \in V(T(a))$ such that $\Gamma_a(u) \neq u$. Then we have $\Gamma_a(u) = v_c$ for some $c \in \chi(E(P_{u v_a})) \setminus \{a\}$.

Proof. Let $w \in V(P_{u \Gamma_a(u)})$ be such that $\Gamma_a(u) = p(w)$. The vertex $w$ always exists because $\Gamma_a(u) \in V(P_u) \setminus \{u\}$. If $\chi(w,\Gamma_a(u)) \neq \chi(\Gamma_a(u), p(\Gamma_a(u)))$ then $\Gamma_a(u)$ is $v_c$ for some $c \in \chi(E(P_{u v_a})) \setminus \{a\}$.

Now, for the sake of contradiction suppose that $\chi(w,\Gamma_a(u)) = \chi(\Gamma_a(u), p(\Gamma_a(u)))$. In this case, we show that for all $v \in P_{u\, p(\Gamma_a(u))} \setminus \{p(\Gamma_a(u))\}$ and $k \in \mathbb{Z}$, $\tau_k(v) \neq 0$ implies

$$2^k \le \frac{d_T(u, p(\Gamma_a(u)))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)}. \qquad (76)$$

This is a contradiction since by definition of $\Gamma_a$, it must be that $\Gamma_a(u)$ is the closest vertex to the root satisfying this condition, yet $p(\Gamma_a(u))$ is closer to the root than $\Gamma_a(u)$.

Observe that,

$$V(P_{u\, p(\Gamma_a(u))}) \setminus \{p(\Gamma_a(u))\} = V(P_{u \Gamma_a(u)}).$$

We first verify (76) for $\Gamma_a(u)$ and $k \in \mathbb{Z}$ with $\tau_k(\Gamma_a(u)) \neq 0$. Since $\Gamma_a(u) \in V(P_u)$, we have

$$d_T(u,\Gamma_a(u)) \le d_T(u, p(\Gamma_a(u))). \qquad (77)$$

Recalling that $p(w) = \Gamma_a(u)$, by Lemma 4.5, for all $k \in \mathbb{Z}$, $\tau_k(\Gamma_a(u)) \le \tau_k(w)$, therefore for all $k \in \mathbb{Z}$ with $\tau_k(\Gamma_a(u)) \neq 0$, we have $\tau_k(w) \neq 0$ as well, hence (75) implies

$$2^k \le \frac{d_T(u,\Gamma_a(u))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)} \le \frac{d_T(u,p(\Gamma_a(u)))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)},$$

where the second inequality is (77). For all other vertices $v \in V(P_{u\Gamma_a(u)}) \setminus \{\Gamma_a(u)\}$ and $k \in \mathbb{Z}$ with $\tau_k(v) \neq 0$, by (75)

$$2^k \le \frac{d_T(u,\Gamma_a(u))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)} \le \frac{d_T(u,p(\Gamma_a(u)))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)},$$

completing the proof. □

Lemma 5.12. Suppose that $a \in \chi(E)$ and $u \in V(T(a))$. For any $w \in V(P_{u\Gamma_a(u)})$ such that $\chi(u,p(u)) = \chi(w,p(w))$ we have $\Gamma_a(w) \in V(P_{u\Gamma_a(u)})$.

Proof. For the sake of contradiction, suppose that $\Gamma_a(w) \notin V(P_{u\Gamma_a(u)})$. Since $w \in V(P_u)$ and $\Gamma_a(w) \notin V(P_{u\Gamma_a(u)})$, we have $\Gamma_a(w) \in V(P_{\Gamma_a(u)})$, and

$$d_T(u,\Gamma_a(u)) \le d_T(u,\Gamma_a(w)). \qquad (80)$$

Since $w \in V(P_{u\Gamma_a(u)})$ by assumption, we have $V(P_{uw}) \setminus \{w\} \subseteq V(P_{u\Gamma_a(u)}) \setminus \{\Gamma_a(u)\}$. Thus for all $v \in V(P_{uw}) \setminus \{w\}$ and $k \in \mathbb{Z}$ with $\tau_k(v) \neq 0$, by (75)

$$2^k \le \frac{d_T(u,\Gamma_a(u))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)} \le \frac{d_T(u,\Gamma_a(w))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)}. \qquad (81)$$

The fact that $w \in V(P_{u\Gamma_a(u)})$ also implies that $d_T(w,\Gamma_a(w)) \le d_T(u,\Gamma_a(w))$. Therefore, for all vertices $v \in V(P_{w\Gamma_a(w)}) \setminus \{\Gamma_a(w)\}$ and $k \in \mathbb{Z}$ with $\tau_k(v) \neq 0$, by (75),

$$2^k \le \frac{d_T(w,\Gamma_a(w))}{\varepsilon\big(\varphi(\chi(w,p(w))) - \varphi(a)\big)} \le \frac{d_T(u,\Gamma_a(w))}{\varepsilon\big(\varphi(\chi(w,p(w))) - \varphi(a)\big)} = \frac{d_T(u,\Gamma_a(w))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)}. \qquad (82)$$

We have,

$$V(P_{u\Gamma_a(w)}) \setminus \{\Gamma_a(w)\} = \big(V(P_{uw}) \setminus \{w\}\big) \cup \big(V(P_{w\Gamma_a(w)}) \setminus \{\Gamma_a(w)\}\big).$$

Hence, by (81) and (82), for all $v \in V(P_{u\Gamma_a(w)}) \setminus \{\Gamma_a(w)\}$ and $k \in \mathbb{Z}$, $\tau_k(v) \neq 0$ implies

$$2^k \le \frac{d_T(u,\Gamma_a(w))}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(a)\big)}. \qquad (83)$$

This is a contradiction to the definition of $\Gamma_a(u)$, since $\Gamma_a(u)$ must be the closest vertex to the root satisfying this condition, yet $\Gamma_a(w)$ is closer to the root than $\Gamma_a(u)$. □

Defining representatives for $\gamma_c$. Now, for each $c \in \chi(E)$, we define a small set of representatives for vertices in $\gamma_c$. Later, we use these sets to bound the contraction of pairs of vertices that have one endpoint in $\gamma_c$.

For $a \in \chi(E)$ and $c \in \chi(E(T(a))) \setminus \{a\}$, we define the set $R_a(c) \subseteq V(\gamma_c)$, the set of representatives for $\gamma_c$, as follows

$$R_a(c) = \bigcup_{i=0}^{\lceil \log_2 1/\delta \rceil - 1} \Big\{ u \in V(\gamma_c) : u \text{ is the furthest vertex from } v_c \text{ s.t. } \Gamma_a(u) \neq u \text{ and } d(u,v_c) \le 2^{-i}\,\mathrm{len}(\gamma_c) \Big\}. \qquad (84)$$

The next lemma says when a vertex has a close representative.

Lemma 5.13. Consider $a \in \chi(E)$ and $c \in \chi(E(T(a))) \setminus \{a\}$. For all vertices $u \in V(\gamma_c)$ with $\Gamma_a(u) \neq u$ there exists a $w \in R_a(c)$ such that,

$$d_T(u,v_c) \le d_T(w,v_c) \le 2\max\big(d_T(u,v_c),\, \delta\,\mathrm{len}(\gamma_c)\big).$$

Proof. Let $i \ge 0$ be such that

$$\frac{d_T(u,v_c)}{\mathrm{len}(\gamma_c)} \in \big(2^{-i-1}, 2^{-i}\big].$$

If $i \le \lceil \log_2 1/\delta \rceil - 1$, then (84) implies that either $u \in R_a(c)$, or there exists a $w \in R_a(c)$ such that

$$d_T(u,v_c) < d_T(w,v_c) \le \frac{\mathrm{len}(\gamma_c)}{2^i} < 2\, d_T(u,v_c).$$

On the other hand, if $i > \lceil \log_2 1/\delta \rceil - 1$, then (84) implies that either $u \in R_a(c)$, or that there exists a $w \in R_a(c)$ such that

$$d_T(u,v_c) < d_T(w,v_c) \le \frac{\mathrm{len}(\gamma_c)}{2^{\lceil \log_2 1/\delta \rceil - 1}} \le 2\delta\,\mathrm{len}(\gamma_c),$$

completing the proof.

□
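The dyadic construction in (84) and the guarantee of Lemma 5.13 can be illustrated on a single monochromatic path. This is a minimal sketch under assumptions: the path is described only by each vertex's distance to $v_c$, the flag `moved[v]` is a hypothetical stand-in for the condition $\Gamma_a(v) \neq v$, and $0 < \delta < 1$.

```python
import math


def representatives(dist, moved, length, delta):
    """One representative per dyadic scale i = 0, ..., ceil(log2(1/delta)) - 1.

    dist[v]  : distance from vertex v to v_c along the path
    moved[v] : True when Gamma_a(v) != v (hypothetical input)
    length   : len(gamma_c)
    """
    num_scales = math.ceil(math.log2(1 / delta))
    reps = set()
    for i in range(num_scales):
        candidates = [v for v in dist if moved[v] and dist[v] <= length / 2 ** i]
        if candidates:
            # the furthest qualifying vertex from v_c at this scale
            reps.add(max(candidates, key=lambda v: dist[v]))
    return reps


def has_close_representative(u, reps, dist, length, delta):
    """Check d(u) <= d(w) <= 2 max(d(u), delta * length) for some representative w."""
    bound = 2 * max(dist[u], delta * length)
    return any(dist[u] <= dist[w] <= bound for w in reps)


dist = {v: 0.5 * v for v in range(33)}       # vertices spaced 0.5 apart, length 16
moved = {v: v % 3 == 1 for v in range(33)}
reps = representatives(dist, moved, 16.0, 0.1)
all_covered = all(
    has_close_representative(u, reps, dist, 16.0, 0.1) for u in dist if moved[u]
)
```

The set of representatives has at most $\lceil \log_2 1/\delta \rceil$ elements, independent of the number of vertices on the path, which is what makes the later union bound affordable.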



The following lemma, in conjunction with Lemma 5.13, reduces the number of vertices in $V(\gamma_c)$ that we need to analyze using Lemma 5.10.

Lemma 5.14. Let $(X,d)$ be a pseudometric, and let $f : V \to X$ be a 1-Lipschitz map. For $x, y \in V$, $x', y' \in V(P_{xy})$ and $h \ge 0$, if $d(f(x),f(y)) \ge d_T(x,y) - h$ then $d(f(x'),f(y')) \ge d_T(x',y') - h$.
Proof. Suppose without loss of generality that $d_T(x',x) \le d_T(y',x)$. Using the triangle inequality,

$$d(f(x'),f(y')) \ge d(f(x),f(y)) - d(f(x),f(x')) - d(f(y),f(y'))$$
$$\ge (d_T(x,y) - h) - d(f(x),f(x')) - d(f(y),f(y'))$$
$$\ge d_T(x,y) - d_T(x,x') - d_T(y,y') - h$$
$$= d_T(x',y') - h.$$

□
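Lemma 5.14 is easy to test by brute force on a path, which is the only part of the tree that matters for a fixed pair $x, y$. The sketch below is an illustration with hypothetical data: vertices are listed in path order with integer positions `pos`, the map `f` sends them to the real line, and the "deficit" of a pair is $d_T - d(f(\cdot),f(\cdot))$.

```python
from itertools import combinations


def is_one_lipschitz(pos, f):
    """Check |f(x) - f(y)| <= |pos[x] - pos[y]| for all pairs."""
    return all(
        abs(f[x] - f[y]) <= abs(pos[x] - pos[y])
        for x, y in combinations(range(len(pos)), 2)
    )


def deficit(pos, f, x, y):
    """How much the pair (x, y) is contracted by f."""
    return abs(pos[x] - pos[y]) - abs(f[x] - f[y])


def lemma_5_14_holds(pos, f):
    """For every pair (x, y), no pair inside the path between them loses more."""
    n = len(pos)
    for x, y in combinations(range(n), 2):
        h = deficit(pos, f, x, y)
        for xp, yp in combinations(range(x, y + 1), 2):
            if deficit(pos, f, xp, yp) > h:
                return False
    return True


pos = [0, 1, 3, 4, 7, 8]          # vertices in path order
f_zigzag = [0, 1, 0, 1, 2, 1]     # a non-monotone 1-Lipschitz map
ok = is_one_lipschitz(pos, f_zigzag) and lemma_5_14_holds(pos, f_zigzag)
```

The 1-Lipschitz hypothesis is essential: the triangle-inequality step bounds $d(f(x),f(x'))$ and $d(f(y),f(y'))$ by the corresponding tree distances.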

The following lemma constitutes the inductive step of the proof of Lemma 5.7.

Lemma 5.15. There exists a universal constant $C$ such that for any color $c \in \chi(E) \cup \{\chi(r,p(r))\}$, the following holds. Suppose that, with non-zero probability, for all $c' \in \rho^{-1}(c)$, and for all pairs $x, y \in V(T(c'))$, we have

$$(1 - C\varepsilon)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta) \le \sum_{i\in\mathbb{Z}} \|f_{i,c'}(x) - f_{i,c'}(y)\|_1 \le d_T(x,y). \qquad (85)$$

Then with non-zero probability for all $x, y \in V(T(c))$, we have

$$(1 - C\varepsilon)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta) \le \sum_{i\in\mathbb{Z}} \|f_{i,c}(x) - f_{i,c}(y)\|_1 \le d_T(x,y). \qquad (86)$$

Proof. Let $\mathcal{E}$ denote the event that, for all $c' \in \rho^{-1}(c)$, and all $x, y \in V(T(c'))$, we have

$$d_T(x,y) \ge \sum_{i\in\mathbb{Z}} \|f_{i,c'}(x) - f_{i,c'}(y)\|_1 \ge (1 - C\varepsilon)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta). \qquad (87)$$

We will prove the lemma by showing that, conditioned on $\mathcal{E}$, (86) holds with non-zero probability. For $x, y \in V(T(c))$ we define,

$$\mu(x,y) = \max\{\varphi(a) : a \in \chi(E) \text{ and } x, y \in V(T(a))\}.$$

Note that since $x, y \in V(T(c))$, we have

$$\mu(x,y) \ge \varphi(c). \qquad (88)$$

It is easy to see that if $\mu(x,y) > \varphi(c)$, then $x, y \in V(T(c'))$ for some $c' \in \rho^{-1}(c)$. By construction, if $c' \in \rho^{-1}(c)$ and $x, y \in V(T(c'))$, then

$$\|f_{i,c}(x) - f_{i,c}(y)\|_1 = \|f_{i,c'}(x) - f_{i,c'}(y)\|_1,$$

hence $\mathcal{E}$ implies that (86) holds for all such pairs. Thus in the remainder of the proof, we need only handle pairs $x, y \in V(T(c))$ with $\mu(x,y) = \varphi(c)$.

Write $\chi(E(T(c))) = \{c_1, c_2, \ldots, c_n\}$, where the colors are ordered so that $\varphi(c_j) \le \varphi(c_{j+1})$ for $j = 1, 2, \ldots, n-1$. Let $\varepsilon_1 = 24\varepsilon$, where the constant 24 comes from Lemma 5.10. And let $\varepsilon_2 = 2C'\varepsilon$, where $C'$ is the constant from Lemma 4.11.
For $i \in [n]$, we define the event $X_i$ as follows: For all $j \le i$, and all $x \in V(\gamma_{c_i})$ and $y \in V(\gamma_{c_j})$ with $\mu(x,y) = \varphi(c)$, we have

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(x,y) - \varepsilon_1\, d_T(x,y) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \delta\,\rho_\chi(x,y;\delta). \qquad (89)$$

For all pairs $x \in V(\gamma_{c_i})$ and $y \in V(\gamma_{c_j})$, the event $X_{\max(i,j)}$ implies,

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(x,y) - (\varepsilon_1 + \varepsilon_2)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta).$$

In particular this shows that for $C = 2C' + 24$, if the events $X_1, X_2, \ldots, X_n$ all occur, then (86) holds for all pairs $x, y \in V(T(c))$. Hence we are left to show that

$$\mathbb{P}[X_1 \wedge \cdots \wedge X_n \mid \mathcal{E}] > 0.$$

To this end, we define new events $\{Y_i : i \in [n]\}$ and we show that for every $i \in [n]$,

$$\mathbb{P}_{\mathcal{E}}[X_1 \wedge \cdots \wedge X_i \mid X_1 \wedge \cdots \wedge X_{i-1} \wedge Y_i] = 1, \qquad (90)$$

and then we bound the probability that $Y_i$ does not occur by,

$$\mathbb{P}_{\mathcal{E}}[\neg Y_i] \le 2^{-3(\varphi(c_i) - \varphi(c)) + 1}. \qquad (91)$$

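By Lemma 5.5 and the definition of $f_{k,c}$ (53), we have $\mathbb{P}_{\mathcal{E}}[X_1] = 1$. Since for all $i \in \{2, \ldots, n\}$, $c_i \in \chi(E(T(c))) \setminus \{c\}$, we have

$$\mathbb{P}_{\mathcal{E}}[X_1 \wedge \cdots \wedge X_n] \ge 1 - \sum_{i=2}^{n} \mathbb{P}_{\mathcal{E}}[\neg Y_i] > 1 - \sum_{i=2}^{n} 2^{-3(\varphi(c_i) - \varphi(c)) + 1} \ge 1 - 2 \cdot 2^{(2-3)} = 0,$$

which completes the proof.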



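For each $i \in [n]$, we define the event $Y_i$ as follows: For all $j < i$, and all vertices $x \in R_c(c_i)$ and $y \in V(\gamma_{c_j})$ with $\mu(x,y) = \varphi(c)$, we have

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 - \sum_{k\in\mathbb{Z}} \|f_{k,c}(\Gamma_c(x)) - f_{k,c}(y)\|_1 \ge (1 - \varepsilon_1/2)\, d_T(x,\Gamma_c(x)). \qquad (92)$$

We now complete the proof of Lemma 5.15 by proving (90) and (91).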



Proof of (90). Suppose that $X_1, \ldots, X_{i-1}$ and $Y_i$ hold. We will show that $X_i$ holds as well. First note that for all vertices $x, y \in V(\gamma_{c_i})$, by Lemma 5.5 and the definition of $f_{k,c_i}$ (53), we have

$$d_T(x,y) = \sum_{k\in\mathbb{Z}} \|f_{k,c_i}(x) - f_{k,c_i}(y)\|_1 = \sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1,$$

thus we only need to prove (89) for pairs $x \in V(\gamma_{c_i})$ and $y \in V(\gamma_{c_j})$ with $j < i$ and $\mu(x,y) = \varphi(c)$. We now divide the pairs with one endpoint in $\gamma_{c_i}$ into two cases based on $\Gamma_c$.

Case I: $x \in V(\gamma_{c_i})$ with $x \neq \Gamma_c(x)$, and $y \in V(\gamma_{c_j})$ for some $j < i$, and $\mu(x,y) = \varphi(c)$.

In this case, by Lemma 5.13, there exists a vertex $z \in R_c(c_i)$ such that

$$d(x,v_{c_i}) \le d(z,v_{c_i}) \le 2\max\big(\delta\,\mathrm{len}(E(\gamma_{c_i})),\, d_T(x,v_{c_i})\big).$$

If $d(x,v_{c_i}) \le \delta\,\mathrm{len}(E(\gamma_{c_i}))$, then by (18), we have $\mathrm{len}(E(\gamma_{c_i})) = \rho_\chi(x,v_{c_i};\delta)$, hence

$$d_T(z,\Gamma_c(z)) \le d_T(v_{c_i},\Gamma_c(z)) + 2\max\big(\delta\,\mathrm{len}(E(\gamma_{c_i})),\, d_T(x,v_{c_i})\big)$$
$$\le d_T(v_{c_i},\Gamma_c(z)) + 2\max\big(\delta\,\rho_\chi(x,v_{c_i};\delta),\, d_T(x,v_{c_i})\big)$$
$$\le d_T(v_{c_i},\Gamma_c(z)) + 2\delta\,\rho_\chi(x,v_{c_i};\delta) + 2\,d_T(x,v_{c_i})$$
$$\le 2\delta\,\rho_\chi(x,v_{c_i};\delta) + 2\,d_T(x,\Gamma_c(z)). \qquad (93)$$




Figure 2: Position of vertices in the subtree $T(c)$ for Case I.



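Since $z \in R_c(c_i)$, by definition we have $\Gamma_c(z) \neq z$, therefore by Lemma 5.11, $\Gamma_c(z) = v_{c'}$ for some color $c' \in \chi(E(P_{z v_c})) \setminus \{c\}$. The function $\varphi$ is non-decreasing along any root-leaf path, hence $\chi(\Gamma_c(z), p(\Gamma_c(z))) = c_\ell$ for some $\ell < i$.

We refer to Figure 2 for the relative position of the vertices referenced in the following inequalities. Using our assumption that $X_1, \ldots, X_{i-1}$ and $Y_i$ hold, we can write

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(z) - f_{k,c}(y)\|_1 \ge d_T(\Gamma_c(z),z) - (\varepsilon_1/2)\, d_T(z,\Gamma_c(z)) + \sum_{k\in\mathbb{Z}} \|f_{k,c}(\Gamma_c(z)) - f_{k,c}(y)\|_1 \qquad \text{(by } Y_i\text{)}$$
$$\ge d_T(\Gamma_c(z),z) - (\varepsilon_1/2)\, d_T(z,\Gamma_c(z)) + d_T(\Gamma_c(z),y) - \varepsilon_2\, d_T(\Gamma_c(\Gamma_c(z)),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(\Gamma_c(z),y;\delta) \qquad \text{(by } X_{\max(\ell,j)}\text{)}$$
$$\ge d_T(y,z) - (\varepsilon_1/2)\, d_T(z,\Gamma_c(z)) - \varepsilon_2\, d_T(\Gamma_c(\Gamma_c(z)),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(\Gamma_c(z),y;\delta).$$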
We may assume that $\varepsilon_1 \le 1$, otherwise the statement of the lemma is vacuous. Using the preceding inequality, and applying Lemma 5.14 on pairs $(z,y)$ and $(x,y)$ implies that

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(x,y) - (\varepsilon_1/2)\, d_T(z,\Gamma_c(z)) - \varepsilon_2\, d_T(\Gamma_c(\Gamma_c(z)),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(\Gamma_c(z),y;\delta)$$
$$\ge d_T(x,y) - (\varepsilon_1/2)\big(2\,d_T(x,\Gamma_c(z)) + 2\delta\,\rho_\chi(x,v_{c_i};\delta)\big) - \varepsilon_2\, d_T(\Gamma_c(\Gamma_c(z)),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(\Gamma_c(z),y;\delta),$$

where in the last line we have used the fact that $\varepsilon_1 \le 1$.

We have $\chi(x,p(x)) = \chi(z,p(z)) = c_i$. Moreover, since $\Gamma_c(z) \neq z$, using Lemma 5.11 it is easy to check that $x \in P_{z\Gamma_c(z)}$. Therefore, by Lemma 5.12,

$$d_T(\Gamma_c(\Gamma_c(z)),y) \le d_T(\Gamma_c(z),y) \le d_T(\Gamma_c(x),y),$$

and combining this with the preceding inequality yields,

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(x,y) - (\varepsilon_1/2)\big(2\,d_T(x,\Gamma_c(z)) + 2\delta\,\rho_\chi(x,v_{c_i};\delta)\big) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(\Gamma_c(z),y;\delta).$$

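Recall the definition of $C(x,y;\delta)$ in (18). Since by Lemma 5.11, $\Gamma_c(z) = v_{c'}$ for some color $c' \in \chi(E(P_{z v_c})) \setminus \{c\}$, we have $C(\Gamma_c(z),y;\delta) \subseteq C(v_{c_i},y;\delta)$, hence $\rho_\chi(v_{c_i},y;\delta) \ge \rho_\chi(\Gamma_c(z),y;\delta)$ and thus,

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(x,y) - (\varepsilon_1/2)\big(2\,d_T(x,\Gamma_c(z)) + 2\delta\,\rho_\chi(x,v_{c_i};\delta)\big) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(v_{c_i},y;\delta)$$
$$\ge d_T(x,y) - \varepsilon_1\, d_T(x,\Gamma_c(z)) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\big(\rho_\chi(v_{c_i},y;\delta) + \varepsilon_1\,\rho_\chi(x,v_{c_i};\delta)\big)$$
$$\ge d_T(x,y) - \varepsilon_1\, d_T(x,\Gamma_c(z)) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\big(\rho_\chi(x,v_{c_i};\delta) + \rho_\chi(v_{c_i},y;\delta)\big),$$

where in the last line we have again used that $\varepsilon_1 \le 1$.

The sets of colors that appear on the paths $P_{x v_{c_i}}$ and $P_{v_{c_i} y}$ are disjoint, therefore $\rho_\chi(x,y;\delta) = \rho_\chi(x,v_{c_i};\delta) + \rho_\chi(v_{c_i},y;\delta)$, and

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(x,y) - \varepsilon_1\, d_T(x,\Gamma_c(z)) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \varepsilon_1\, d_T(\Gamma_c(z),y) - \delta\,\rho_\chi(x,y;\delta)$$
$$= d_T(x,y) - \varepsilon_1\, d_T(x,y) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \delta\,\rho_\chi(x,y;\delta).$$

Case II: $x \in V(\gamma_{c_i})$ with $x = \Gamma_c(x)$, and $y \in V(\gamma_{c_j})$ for some $j < i$, and $\mu(x,y) = \varphi(c)$.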



In this case, if $x \in V(\gamma_c)$ then the event $X_j$ implies (89). On the other hand, suppose that $x \in V(T(c'))$ for some $c' \in \rho^{-1}(c)$. Recall that $\varepsilon_2/2 = C'\varepsilon$, where $C'$ is the constant from Lemma 4.11. By Lemma 4.11 (with $c'$, $x$, and $\varepsilon_2/2$ substituted for $c$, $v$, and $\varepsilon$, respectively, in the statement of Lemma 4.11), there exist vertices $u, u' \in \{x\} \cup \{v_a : a \in \chi(E(P_{x v_{c'}}))\}$ such that

$$d_T(x,u) \le (\varepsilon_2/2)\, d_T(u',u), \qquad (94)$$

and for all vertices $z \in V(P_{u'u}) \setminus \{u'\}$ and for all $k \in \mathbb{Z}$,

$$\tau_k(z) \neq 0 \implies 2^k \le \frac{d_T(u,u')}{\varepsilon\big(\varphi(\chi(u,p(u))) - \varphi(c)\big)}.$$

We have $\chi(v_{c'}, p(v_{c'})) = c$, and this condition is exactly the same condition as (75) for $\Gamma_c(u)$, therefore

$$d_T(x,u) \le (\varepsilon_2/2)\, d_T(u',u) \le (\varepsilon_2/2)\, d_T(\Gamma_c(u),u). \qquad (95)$$

Note that the assumption that $\Gamma_c(x) = x$ implies that $u \neq x$ and $u = v_a$ for some $a \in \chi(E(P_{x v_{c'}}))$. We have,

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 - \sum_{k\in\mathbb{Z}} \|f_{k,c}(u) - f_{k,c}(y)\|_1 \ge -\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(u)\|_1$$
$$\ge -d_T(x,u)$$
$$\ge d_T(x,u) - \varepsilon_2\, d_T(u,\Gamma_c(u))$$
$$\ge d_T(x,u) - \varepsilon_2\, d_T(x,\Gamma_c(u))$$
$$= d_T(x,u) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(u)). \qquad (96)$$

Since $u = v_a$ for some $a \in \chi(E(P_{x v_{c'}}))$, $\chi(u,p(u)) = c_\ell$ for some $\ell < i$, and $X_{\max(\ell,j)}$ implies that,

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(u) - f_{k,c}(y)\|_1 \ge d_T(u,y) - \varepsilon_2\, d_T(\Gamma_c(u),\Gamma_c(y)) - \varepsilon_1\, d_T(u,y) - \delta\,\rho_\chi(u,y;\delta).$$

Recall the definition of $C(x,y;\delta)$ in (18). We have $u = v_a$ for some $a \in \chi(E(P_{x v_{c'}}))$, therefore $C(u,y;\delta) \subseteq C(x,y;\delta)$, and $\rho_\chi(u,y;\delta) \le \rho_\chi(x,y;\delta)$. Now we can write,

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(u) - f_{k,c}(y)\|_1 \ge d_T(u,y) - \varepsilon_2\, d_T(\Gamma_c(u),\Gamma_c(y)) - \varepsilon_1\, d_T(u,y) - \delta\,\rho_\chi(x,y;\delta). \qquad (97)$$

Adding (96) and (97) we can conclude that

$$\sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \ge d_T(u,y) + d_T(u,x) - \varepsilon_2\big(d_T(\Gamma_c(x),\Gamma_c(u)) + d_T(\Gamma_c(u),\Gamma_c(y))\big) - \varepsilon_1\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta)$$
$$\ge d_T(x,y) - \varepsilon_2\, d_T(\Gamma_c(x),\Gamma_c(y)) - \varepsilon_1\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta),$$

completing the proof of (90).



Proof of (91). We prove this inequality by first bounding the probability that (92) holds for a fixed $x$ and all $y \in V(\gamma_{c_j})$ (for a fixed $j \in \{1, \ldots, i-1\}$) with $\mu(x,y) = \varphi(c)$. Then we use a union bound to complete the proof.

We start the proof by giving some definitions. For a vertex $x \in R_c(c_i)$, let

$$S_x = \big\{ j \in \{1, \ldots, i-1\} : \text{there exists a } v \in V(\gamma_{c_j}) \text{ such that } \mu(x,v) = \varphi(c) \big\}.$$

And for $a \in S_x$, we define $w(x;a)$ as the vertex $v \in V(\gamma_{c_a})$ which is furthest from the root among those satisfying $\mu(x,v) = \varphi(c)$. Finally for $x \in R_c(c_i)$, we put

$$\beta_x = \max\{k \in \mathbb{Z} : \exists z \in P_{x\Gamma_c(x)} \setminus \{\Gamma_c(x)\},\ \tau_k(z) \neq 0\}.$$

Inequality (75) implies,

$$2^{\beta_x} \le \frac{d_T(x,\Gamma_c(x))}{\varepsilon\big(\varphi(c_i) - \varphi(c)\big)}. \qquad (98)$$



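By definition of $R_c$, for all elements $x \in R_c(c_i)$, we have $\Gamma_c(x) \neq x$. Moreover, by Lemma 5.11, $\Gamma_c(x) = v_{c'}$ for some $c' \in \chi(E(P_{x v_c})) \setminus \{c\}$. Now, for $x \in R_c(c_i)$ and $a \in S_x$ we apply Lemma 5.10 with $\varepsilon_1/2 = 12\varepsilon$ to write

$$\mathbb{P}_{\mathcal{E}}\Big[\exists y \in P_{w(x;a)\, v_c} : \sum_{k\in\mathbb{Z}} \|f_{k,c}(x) - f_{k,c}(y)\|_1 \le (1 - \varepsilon_1/2)\, d_T(x,\Gamma_c(x)) + \sum_{k\in\mathbb{Z}} \|f_{k,c}(y) - f_{k,c}(\Gamma_c(x))\|_1\Big]$$
$$\le \frac{1}{\lceil \log_2 1/\delta \rceil} \exp\Big(-\frac{12\, d_T(x,\Gamma_c(x))}{\varepsilon\, 2^{\beta_x+2}}\Big) \le \frac{\exp\big(-3(\varphi(c_i) - \varphi(c))\big)}{\lceil \log_2 1/\delta \rceil}. \qquad (99)$$

Note that, for all $y \in V(\gamma_{c_a})$ with $\mu(x,y) = \varphi(c)$, we have $y \in P_{w(x;a)\, v_c}$.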

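By definition of $R_c(c_i)$, $|R_c(c_i)| \le \lceil \log_2 \delta^{-1} \rceil$. We also have $\varphi(c_j) \le \varphi(c_i)$ for $j < i$, and by Corollary 4.8, $|S_x| < i \le 2^{\varphi(c_i) - \varphi(c) + 1}$. Taking a union bound over all $x \in R_c(c_i)$ and $a \in S_x$ implies,

$$\mathbb{P}_{\mathcal{E}}[\neg Y_i] \le \big(\lceil \log_2 \delta^{-1} \rceil\big)\big(2^{\varphi(c_i) - \varphi(c) + 1}\big)\, \frac{1}{\lceil \log_2 \delta^{-1} \rceil} \exp\big(-3(\varphi(c_i) - \varphi(c))\big) = 2^{\varphi(c_i) - \varphi(c) + 1} \exp\big(-3(\varphi(c_i) - \varphi(c))\big).$$

Since $\varphi(c_i) > \varphi(c)$, by an elementary calculation we conclude that

$$\mathbb{P}_{\mathcal{E}}[\neg Y_i] \le 2 \cdot 2^{-3(\varphi(c_i) - \varphi(c))},$$

which completes the proof of (91).

□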
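The "elementary calculation" at the end of this proof amounts to the inequality $2^{g+1} e^{-3g} \le 2 \cdot 2^{-3g}$ for $g = \varphi(c_i) - \varphi(c) \ge 0$, which is equivalent to $16/e^3 \le 1$. A short numerical check, with function names chosen here purely for illustration:

```python
import math


def union_bound(gap):
    """Right-hand side after the union bound: 2^(gap + 1) * exp(-3 * gap)."""
    return 2.0 ** (gap + 1) * math.exp(-3.0 * gap)


def target(gap):
    """The claimed bound 2 * 2^(-3 * gap) from (91)."""
    return 2.0 * 2.0 ** (-3.0 * gap)


# The ratio union_bound / target equals (16 / e^3)^gap, and 16 / e^3 < 1.
base = 16.0 / math.exp(3.0)
holds_on_grid = all(union_bound(0.01 * k) <= target(0.01 * k) for k in range(1, 2000))
```

Since the base $16/e^3 \approx 0.797$ is below one, the inequality holds for every non-negative gap, with equality only at gap zero.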



Finally, we present the proof of Lemma 5.7.

Proof of Lemma 5.7. Let $C$ be the same constant as in Lemma 5.15. For the sake of contradiction, suppose that

$$\mathbb{P}\Big[\forall x, y \in V,\ (1 - C\varepsilon)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta) \le \sum_{i\in\mathbb{Z}} \|f_i(x) - f_i(y)\|_1 \le d_T(x,y)\Big] = 0.$$

Now let $c \in \chi(E) \cup \{\chi(r,p(r))\}$ be a color with a maximal value of $\varphi(c)$ such that,

$$\mathbb{P}\Big[\forall x, y \in V(T(c)),\ (1 - C\varepsilon)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta) \le \sum_{i\in\mathbb{Z}} \|f_{i,c}(x) - f_{i,c}(y)\|_1 \le d_T(x,y)\Big] = 0. \qquad (100)$$

For $a \in \chi(E)$, $\kappa(a) > 0$. Hence, for all $c' \in \rho^{-1}(c)$, by (32), $\varphi(c') > \varphi(c)$, and by maximality of $c$, for all $c' \in \rho^{-1}(c)$, we have

$$\mathbb{P}\Big[\forall x, y \in V(T(c')),\ (1 - C\varepsilon)\, d_T(x,y) - \delta\,\rho_\chi(x,y;\delta) \le \sum_{i\in\mathbb{Z}} \|f_{i,c'}(x) - f_{i,c'}(y)\|_1 \le d_T(x,y)\Big] > 0.$$

But now applying Lemma 5.15 contradicts (100), completing the proof.

□